Foreword


The fourth volume on Advances and Applications of Dezert-Smarandache Theory 
(DSmT) for information fusion collects theoretical and applied contributions of 
researchers working in different fields of applications and in mathematics. The 
contributions (see List of Articles published in this book, at the end of the volume) 
have been published or presented after disseminating the third volume (2009,
http://fs.gallup.unm.edu/DSmT-book3.pdf) in international conferences, seminars,
workshops and journals. 


The first part of this book presents the theoretical advancement of DSmT, dealing with
Belief functions, conditioning and deconditioning, Analytic Hierarchy Process, 
Decision Making, Multi-Criteria, evidence theory, combination rule, evidence distance, 
conflicting belief, sources of evidences with different importance and reliabilities, 
importance of sources, pignistic probability transformation, Qualitative reasoning 
under uncertainty, Imprecise belief structures, 2-Tuple linguistic label, Electre Tri 
Method, hierarchical proportional redistribution, basic belief assignment, subjective 
probability measure, neutrosophic logic, Evidence theory, outranking methods, 
Dempster-Shafer Theory, Bayes fusion rule, frequentist probability, mean square error, 
controlling factor, optimal assignment solution, data association, Transferable Belief 
Model, and others. 


More applications of DSmT have emerged in the years since the publication of
the third DSmT book in 2009. Accordingly, the second part of this volume is about
applications of DSmT in connection with Electronic Support Measures, belief functions,
sensor networks, Ground Moving Target and Multiple Target Tracking, Vehicle-Borne
Improvised Explosive Devices, Belief Interacting Multiple Model filter, seismic and
acoustic sensor, Support Vector Machines, Alarm classification, ability of human 
visual system, Uncertainty Representation and Reasoning Evaluation Framework, 
Threat Assessment, Handwritten Signature Verification, Automatic Aircraft 
Recognition, Dynamic Data-Driven Application System, adjustment of secure 
communication trust analysis, and so on. 


Finally, the third part presents a List of References related to DSmT, published or
presented over the years since its inception in 2004, in chronological order.


We want to thank all the contributors to this fourth volume for their research work
and their interest in the development of DSmT.
We are grateful as well to other colleagues for encouraging us to edit a new
volume, for sharing several ideas with us, and for their questions and comments on
DSmT through the years. We thank the International Society of Information Fusion
(www.isif.org) for disseminating the main research works related to information fusion
(including DSmT) in the international fusion conference series over the years.

This book is dedicated to the memory of our good friends and colleagues Dr. Jean-Pierre
Le Cadre, Prof. Pierre Valin (ISIF President 2006) and Prof. Darko Mušicki
(ISIF President 2008), who were always very active in ISIF and in the
organization of past fusion conferences. We will never forget them.


Also, Florentin Smarandache is grateful to The University of New Mexico, U.S.A.,
which on many occasions partially sponsored his attendance at international conferences,
workshops and seminars on Information Fusion, and Jean Dezert is grateful to the Department of
Information Modeling and Processing (DTIM) at the French Aerospace Lab (Office
National d’Etudes et de Recherches Aérospatiales), Palaiseau, France, for encouraging
him to carry on this research and for its financial support.

For the next volume, authors are invited to send their articles on DSmT to the
editors:

Prof. Florentin Smarandache (fsmarandache@gmail.com)
Dr. Jean Dezert (jdezert@gmail.com).


The Editors. 


Part 1:
Theoretical advances
on DSmT

Non Bayesian Conditioning and Deconditioning


Jean Dezert 
Florentin Smarandache 


Abstract—In this paper, we present a Non-Bayesian conditioning
rule for belief revision. This rule is truly Non-Bayesian in
the sense that it does not satisfy the commonly adopted principle
that when a prior belief is Bayesian, after conditioning by X,
Bel(X|X) must be equal to one. Our new conditioning rule for
belief revision is based on the proportional conflict redistribution
rule of combination developed in DSmT (Dezert-Smarandache
Theory), which abandons Bayes’ conditioning principle. Such
Non-Bayesian conditioning makes it possible to take into account
judiciously the level of conflict between the available prior belief and
the conditional evidence. We also introduce the deconditioning
problem and show that this problem admits a unique solution
in the case of a Bayesian prior; a solution which cannot be
obtained when the classical Shafer and Bayes conditioning rules are
used. Several simple examples are also presented to compare
the results of this new Non-Bayesian conditioning with those of the
classical one.

Keywords: Belief functions, conditioning, deconditioning,
probability, DST, DSmT, Bayes rule.


I. INTRODUCTION 


The question of the updating of probabilities and beliefs
has yielded, and still yields, passionate philosophical and
mathematical debates [3], [6], [7], [9], [12], [13], [17], [20],
[22] in the scientific community, and it arises from the
different interpretations of probabilities. This question has
been reinforced by the emergence of the possibility and the
evidence theories in the eighties [4], [16] for dealing with
uncertain information. We cannot review in detail here all
the different authors’ opinions [1], [2], [8], [10], [14], [15]
on this important question, but we suggest that the reader start
with Dubois & Prade’s survey [5]. In this paper, we propose a
true Non-Bayesian rule of combination which does not satisfy
the well-adopted Bayes principle stating that P(X|X) = 1
(or Bel(X|X) = 1 when working with belief functions).
We show that by abandoning this Bayes principle, one can
take into account more efficiently in the conditioning process
the level of the existing conflict between the prior evidence
and the new conditional evidence. We also show that
full deconditioning is possible in some specific cases. Our
approach is based on belief functions and the Proportional
Conflict Redistribution (mainly PCR5) rule of combination
developed in the Dezert-Smarandache Theory (DSmT) framework
[18]. Why do we use PCR5 here? Because PCR5 is very efficient




Originally published as: Dezert J., Smarandache F.- Non Bayesian 
conditioning and deconditioning, in Proc. of International 
Workshop on Belief Functions, Brest, France, April 2-4, 2010, 
and reprinted with permission. 


to combine conflicting sources of evidence¹ and because
Dempster’s rule, often considered as a generalization of Bayes’
rule, is actually not deconditionable (see the examples in the
sequel), contrary to PCR5.
This paper is organized as follows. In section II, we briefly
recall Dempster’s rule of combination and Shafer’s Conditioning
Rule (SCR) proposed in the Dempster-Shafer Theory (DST)
of belief functions [16]. In section III, we introduce a new
Non-Bayesian conditioning rule and show how it differs from
SCR. In section IV, we introduce the dual problem,
called the deconditioning problem. Some examples are given
in section V, with concluding remarks in section VI.


II. SHAFER’S CONDITIONING RULE 


In DST, a normalized basic belief assignment (bba) $m(.)$
is defined as a mapping from the power set $2^\Theta$ of the
finite discrete frame of discernment $\Theta$ into $[0,1]$ such that
$m(\emptyset) = 0$ and $\sum_{X \in 2^\Theta} m(X) = 1$. Belief and plausibility
functions are in one-to-one correspondence with $m(.)$ and are
respectively defined by $Bel(X) = \sum_{Z \in 2^\Theta,\, Z \subseteq X} m(Z)$ and
$Pl(X) = \sum_{Z \in 2^\Theta,\, Z \cap X \neq \emptyset} m(Z)$. They are usually interpreted
as lower and upper bounds of an unknown measure of subjective
probability $P(.)$, i.e. $Bel(X) \leq P(X) \leq Pl(X)$ for any $X$. In
DST, the combination of two independent sources of evidence
characterized by $m_1(.)$ and $m_2(.)$ is done using Dempster’s
rule as follows²:

$$m_{DS}(X) = \frac{\sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2)}{1 - \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = \emptyset}} m_1(X_1)\, m_2(X_2)} \tag{1}$$


Shafer’s conditioning rule³ (SCR) is obtained as the result
of Dempster’s combination of the given prior bba $m_1(.)$
with the conditional evidence, say $Y$, represented by a source
$m_2(.)$ only focused on $Y$, that is, such that $m_2(Y) = 1$. In
other words, $m(X|Y) = m_{DS}(X) = (m_1 \oplus m_2)(X)$ using
$m_2(Y) = 1$, where the $\oplus$ symbol denotes here Dempster’s

¹Due to space limitations, we neither present nor justify again PCR5 w.r.t.
other rules since this has been widely explained in the literature with many
examples and discussions; see for example [18], Vol. 2, and our web page.

²assuming that the denominator is not zero (the sources are not in total
conflict).

³also called Dempster’s conditioning by Glenn Shafer in [16].


Advances and Applications of DSmT for Information Fusion. Collected Works. Volume 4 


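fusion rule (1). It can be shown [16] that the conditional belief
and the plausibility are given by⁴:

$$Bel(X|Y) = \sum_{\substack{Z \in 2^\Theta \\ Z \subseteq X}} m_{DS}(Z|Y) = \frac{Bel_1(X \cup \bar{Y}) - Bel_1(\bar{Y})}{1 - Bel_1(\bar{Y})} \tag{2}$$

$$Pl(X|Y) = \sum_{\substack{Z \in 2^\Theta \\ Z \cap X \neq \emptyset}} m_{DS}(Z|Y) = \frac{Pl_1(X \cap Y)}{Pl_1(Y)} \tag{3}$$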
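The following minimal Python sketch illustrates Dempster's rule (1) and Shafer's conditioning as described above. It is an illustration under stated assumptions, not the authors' implementation: a bba is represented (our choice) as a dictionary mapping `frozenset` focal elements to masses, the function names are hypothetical, and the numerical inputs are those of Table II and of Example 1 in section V, with $m_2(A \cup C) = 0.5$ inferred from normalization.

```python
from itertools import product

def dempster(m1, m2):
    """Dempster's rule (1): normalized conjunctive combination of two bba's."""
    joint = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        inter = x1 & x2
        joint[inter] = joint.get(inter, 0.0) + w1 * w2
    # Mass of all pairs with empty intersection (the conflict).
    conflict = joint.pop(frozenset(), 0.0)
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict")
    return {x: w / (1.0 - conflict) for x, w in joint.items()}

def shafer_conditioning(m1, y):
    """SCR: combine the prior with a bba fully focused on Y, i.e. m2(Y) = 1."""
    return dempster(m1, {frozenset(y): 1.0})

A, B, C = frozenset("A"), frozenset("B"), frozenset("C")

# Bayesian m1 and non-Bayesian m2 (assumed values, see lead-in).
m1 = {A: 0.1, B: 0.2, C: 0.7}
m2 = {B: 0.3, C: 0.2, A | C: 0.5}
m_ds = dempster(m1, m2)

# Conditioning a Bayesian prior on Y = A u B reduces to Bayes' formula.
posterior = shafer_conditioning({A: 0.49, B: 0.49, C: 0.02}, "AB")
```

With these inputs the combined bba has only singleton focal elements, in line with the remark that Dempster's rule applied to a Bayesian bba yields a Bayesian result. Single-character element names are used only so that `frozenset("AB")` denotes $A \cup B$.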


When the belief is Bayesian⁵, i.e. $Bel(.|Y) = Pl(.|Y) =
P(.|Y)$, SCR reduces to the classical conditional probability
definition (Bayes formula), that is $P(X|Y) = P(X \cap Y)/P(Y)$,
with $P(.) = m_1(.)$. Note that when $Y = X$ and as soon
as $Bel_1(\bar{X}) < 1$, one always gets from (2) $Bel(X|X) = 1$
because $Bel_1(X \cup \bar{Y}) = Bel_1(X \cup \bar{X}) = Bel_1(\Theta) = 1$.
For a Bayesian belief, this implies $P(X|X) = 1$ for any $X$
such that $P_1(X) > 0$, which we call the Bayes principle. Other
alternatives have been proposed in the literature [8], [15],
[21], but almost all of them satisfy the Bayes principle and they
are all in some way extensions/generalizations of Bayes’ rule. A
true Non-Bayesian conditioning (called weak conditioning)
was however introduced by Planchet in 1989 in [14], but
it did not attract sufficient interest because the Bayes principle
is generally considered as the best solution for probability
updating, based on different arguments supporting this
idea. Such considerations did not dissuade us from abandoning the Bayes
principle and exploring new Non-Bayesian ways of belief
updating, as Planchet did in the nineties. We will show in the next
section why Non-Bayesian conditioning can be interesting.


III. A NON BAYESIAN CONDITIONING RULE 


Before presenting our Non-Bayesian conditioning rule,
it is important to recall briefly the Proportional Conflict
Redistribution rule no. 5 (PCR5), which has been proposed
as a serious alternative to Dempster’s rule [16] in Dezert-Smarandache
Theory (DSmT) [18] for dealing with conflicting
belief functions. In this paper, we assume working in the same
fusion space as Glenn Shafer, i.e. on the power set $2^\Theta$ of
the finite frame of discernment $\Theta$ made of exhaustive and
exclusive elements.


A. PCR5 rule of combination


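Definition: Let $m_1(.)$ and $m_2(.)$ be two independent⁶ bba’s;
then the PCR5 rule of combination is defined as follows
(see [18], Vol. 2 for details, justification and examples) when
working in the power set $2^\Theta$: $m_{PCR5}(\emptyset) = 0$ and $\forall X \in 2^\Theta \setminus \{\emptyset\}$

$$m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) + \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \tag{4}$$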


⁴$\bar{Y}$ denotes the complement of $Y$ in the frame $\Theta$.
⁵the focal elements of $m_1(.|Y)$ are singletons only.
⁶i.e. each source provides its bba independently of the other sources.


All fractions in (4) having zero denominators are discarded.
The extension and a variant of (4) (called PCR6) for
combining $s > 2$ sources and for working in other fusion
spaces are presented in detail in [18]. Basically, in PCR5 the
partial conflicting masses are redistributed proportionally to
the masses of the elements which are involved in the partial
conflict only, so that the specificity of the information is
entirely preserved through this fusion process. It has been
clearly shown in [18], Vol. 3, chap. 1 that Smets’ rule⁷ is
neither so useful nor cogent, because it does not respond to new
information in a global or in a sequential fusion process.
Indeed, Smets’ fusion result very quickly commits the full
mass of belief to the empty set. In applications, some
ad-hoc numerical techniques must be used to circumvent this
serious drawback. This problem does not occur with the PCR5
rule. By construction, other well-known rules like Dubois &
Prade’s or Yager’s rule, contrary to PCR5, increase
the non-specificity of the result.
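The proportional redistribution in (4) can be sketched in a few lines of Python. This is a hedged illustration rather than a reference implementation: the dictionary-of-`frozenset` representation and the function name `pcr5` are our assumptions, and the input masses are those of the example given in footnote 8.

```python
from itertools import product

def pcr5(m1, m2):
    """PCR5 combination (4) of two bba's given as {frozenset: mass} dicts."""
    out = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        inter = x1 & x2
        if inter:
            # Non-conflicting pair: conjunctive part of (4).
            out[inter] = out.get(inter, 0.0) + w1 * w2
        elif w1 + w2 > 0.0:
            # Partial conflict: the product w1*w2 is split back between x1 and x2
            # proportionally to w1 and w2 (zero denominators are discarded).
            out[x1] = out.get(x1, 0.0) + w1 ** 2 * w2 / (w1 + w2)
            out[x2] = out.get(x2, 0.0) + w2 ** 2 * w1 / (w1 + w2)
    return out

A, B, C = frozenset("A"), frozenset("B"), frozenset("C")

m1 = {A: 0.3, B: 0.7}
m2 = {A: 0.1, B: 0.2, C: 0.4, A | B: 0.3}
fused = pcr5(m1, m2)
```

Because each conflicting product is returned only to the two elements involved in that conflict, the total mass stays equal to one without any global normalization step.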


Properties of PCR5:

• (P0): The PCR5 rule is not associative, but it is quasi-associative
(see [18], Vol. 2).
• (P1): The PCR5 fusion of two non-Bayesian bba’s is a non-Bayesian
bba.
Example: Consider $\Theta = \{A, B, C\}$ with Shafer’s model
and with the two non-Bayesian bba’s $m_1(.)$ and $m_2(.)$
given in Table I. The PCR5 fusion result (rounded to the
fourth decimal) is given in the right column of Table
I. One sees that $m_{PCR5}(.)$ is a non-Bayesian bba since
some of its focal elements are not singletons.


Table I
PCR5 FUSION OF TWO NON BAYESIAN BBA’S.
[Columns: Focal Elem., $m_1(.)$, $m_2(.)$, $m_{PCR5}(.)$; the table body is not recoverable.]


• (P2): The PCR5 fusion of a Bayesian bba with a non-Bayesian
bba is a non-Bayesian bba in general⁸.

Example: Consider $\Theta = \{A, B, C\}$ with Shafer’s model
and the Bayesian and non-Bayesian bba’s $m_1(.)$ and $m_2(.)$
to combine as given in Table II. The PCR5 fusion result
is given in the right column of Table II. One sees that
$m_{PCR5}(.)$ is a non-Bayesian bba since some of its focal
elements are not singletons.

This property is in opposition with the property of Dempster’s
rule (see Theorem 3.7, p. 67 in [16]) which states that
if $Bel_1$ is Bayesian and if $Bel_1$ and $Bel_2$ are combinable,
then Dempster’s rule always provides a Bayesian belief
function. The result of Dempster’s rule, denoted $m_{DS}(.)$, for

⁷i.e. the non-normalized Dempster’s rule.

⁸In some cases, it happens that Bayesian ⊕ Non-Bayesian = Bayesian. For
example, with $\Theta = \{A, B, C\}$, Shafer’s model, $m_1(A) = 0.3$, $m_1(B) =
0.7$ and $m_2(A) = 0.1$, $m_2(B) = 0.2$, $m_2(C) = 0.4$ and $m_2(A \cup B) = 0.3$,
one gets $m_{PCR5}(A) = 0.2162$, $m_{PCR5}(B) = 0.6134$ and $m_{PCR5}(C) =
0.1704$, which is a Bayesian bba.

Table II
PCR5 FUSION OF BAYESIAN AND NON BAYESIAN BBA’S.

Focal Elem.    m1(.)    m2(.)    mPCR5(.)    mDS(.)
A              0.1      0        0.0642      0.0833
B              0.2      0.3      0.1941      0.1000
C              0.7      0.2      0.6703      0.8167
A ∪ C          0        0.5      0.0714      0


this example is given in Table II for convenience. This is
the major difference between PCR5 and Dempster’s rule,
not to mention the management of conflicting information
in the fusion process, of course.
In summary, using the ⊕ symbol to denote the generic
fusion process, one has

– With Dempster’s rule:
Bayesian ⊕ Non-Bayesian = Bayesian

– With PCR5 rule:
Bayesian ⊕ Non-Bayesian = Non-Bayesian (in general)

• (P3): The PCR5 fusion of two Bayesian bba’s is a Bayesian
bba (see [18], Vol. 2, pp. 43-45 for proof).

Example: Consider $\Theta = \{A, B, C\}$ with Shafer’s model and
the Bayesian bba’s given in Table III. The
result of the PCR5 fusion rule is given in the right column
of Table III. One sees that $m_{PCR5}(.)$ is Bayesian since
its focal elements are singletons of the fusion space $2^\Theta$.


Table III
PCR5 FUSION OF TWO BAYESIAN BBA’S.
[Columns: Focal Elem., $m_1(.)$, $m_2(.)$, $m_{PCR5}(.)$; only the values 0.0567 and 0.7396 survive from the table body.]


B. A true Non Bayesian conditioning rule 


Here⁹ we follow in the footsteps of Glenn Shafer in the sense
that we consider the conditioning as the result of the fusion
of any prior mass $m_1(.)$ defined on $2^\Theta$ with the bba $m_2(.)$
focused on the conditional event $Y \neq \emptyset$, i.e. $m_2(Y) = 1$.
We however replace Dempster’s rule by the more efficient¹⁰
Proportional Conflict Redistribution rule no. 5 (PCR5) given by
(4) and proposed in DSmT [18]. This new conditioning rule is
not Bayesian and we use the symbol $\parallel$ (parallel) instead of the
classical symbol $|$ to avoid confusion in notations. Let us give
the expression of $m(X \parallel Y)$ resulting from the PCR5 fusion of
any prior bba $m_1(.)$ with $m_2(.)$ focused on $Y$. Applying (4):


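$$m(X \parallel Y) = S_1^{PCR5}(X,Y) + S_2^{PCR5}(X,Y) + S_3^{PCR5}(X,Y) \tag{5}$$

with

$$S_1^{PCR5}(X,Y) \triangleq \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) \tag{6}$$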


⁹More sophisticated conditioning rules have been proposed in [18], Vol. 2.

¹⁰It deals better with partial conflicts than other rules, unlike Dempster’s
rule; it does not increase the non-specificity of the result, unlike Dubois &
Prade’s or Yager’s rule; and it does respond to new information, unlike Smets’
rule.


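$$S_2^{PCR5}(X,Y) \triangleq m_1(X)^2 \sum_{\substack{X_2 \in 2^\Theta \\ X \cap X_2 = \emptyset}} \frac{m_2(X_2)}{m_1(X) + m_2(X_2)} \tag{7}$$

$$S_3^{PCR5}(X,Y) \triangleq m_2(X)^2 \sum_{\substack{X_2 \in 2^\Theta \\ X \cap X_2 = \emptyset}} \frac{m_1(X_2)}{m_2(X) + m_1(X_2)} \tag{8}$$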


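where $m_2(Y) = 1$ for a given $Y \neq \emptyset$.

Since $Y$ is the single focal element of $m_2(.)$, the term
$S_1^{PCR5}(X,Y)$ in (5) is given by $\sum_{X_1 \in 2^\Theta,\, X_1 \cap Y = X} m_1(X_1)$, the term
$S_2^{PCR5}(X,Y)$ equals $\delta(X \cap Y = \emptyset) \cdot \frac{m_1(X)^2}{1 + m_1(X)}$, and the term
$S_3^{PCR5}(X,Y)$ can be expressed depending on the value of $X$
with respect to the conditioning term $Y$:

• If $X \neq Y$ then $m_2(X \neq Y) = 0$ (by definition), and
thus $S_3^{PCR5}(X,Y) = 0$.
• If $X = Y$ then $m_2(X = Y) = 1$ (by definition), and
thus $S_3^{PCR5}(X,Y) = \sum_{X_2 \in 2^\Theta,\, X_2 \cap Y = \emptyset} \frac{m_1(X_2)}{1 + m_1(X_2)}$.

Finally, $S_3^{PCR5}(X,Y)$ can be written as

$$S_3^{PCR5}(X,Y) = \delta(X \neq Y) \cdot 0 + \delta(X = Y) \cdot \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap Y = \emptyset}} \frac{m_1(X_2)}{1 + m_1(X_2)} = \delta(X = Y) \cdot \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap Y = \emptyset}} \frac{m_1(X_2)}{1 + m_1(X_2)}$$

Finally, $m(X \parallel Y)$ for $X \neq \emptyset$ and $Y \neq \emptyset$ is given by

$$m(X \parallel Y) = \sum_{\substack{X_1 \in 2^\Theta \\ X_1 \cap Y = X}} m_1(X_1) + \delta(X \cap Y = \emptyset) \cdot \frac{m_1(X)^2}{1 + m_1(X)} + \delta(X = Y) \cdot \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap Y = \emptyset}} \frac{m_1(X_2)}{1 + m_1(X_2)} \tag{9}$$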


$m(\emptyset \parallel Y \neq \emptyset) = 0$ by definition, since PCR5 fusion does not
commit mass to the empty set. $m(X \parallel \emptyset)$ is kept undefined¹¹
since it does not make sense to revise a bba by an impossible
event. Based on the classical definitions of the $Bel(.)$ and $Pl(.)$
functions [16], one has:

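$$Bel(X \parallel Y) = \sum_{\substack{Z \in 2^\Theta \\ Z \subseteq X}} m(Z \parallel Y) \tag{10}$$

$$Pl(X \parallel Y) = \sum_{\substack{Z \in 2^\Theta \\ Z \cap X \neq \emptyset}} m(Z \parallel Y) \tag{11}$$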


The "true" unknown (non-Bayesian) conditional subjective
probability, denoted $P(X \parallel Y)$, must satisfy

$$Bel(X \parallel Y) \leq P(X \parallel Y) \leq Pl(X \parallel Y) \tag{12}$$

¹¹One could also define $m(\emptyset \parallel \emptyset) = 1$ and $m(X \neq \emptyset \parallel \emptyset) = 0$, which
however would not be a normal bba.




$P(X \parallel Y)$ can be seen as an imprecise probability and used
within IPT (Imprecise Probability Theory) [23] if necessary,
or can be approximated from $m(. \parallel Y)$ using some probabilistic
transforms, typically the pignistic transform [19] or the DSmP
transform [18] (Vol. 3, Chap. 3). The search for direct closed-form
expressions of $Bel(X \parallel Y)$ and $Pl(X \parallel Y)$ from
$Bel_1(.)$ and $Pl_1(.)$ appears to be an open and difficult problem.
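As a hedged illustration of the closed form (9), the sketch below conditions a prior bba on an event $Y$ without going through the general fusion rule. The function name `condition_pcr5` and the dictionary-of-`frozenset` representation are our assumptions; the prior used is $m_1(A) = m_1(B) = 0.49$, $m_1(C) = 0.02$ with $Y = A \cup B$, as in Example 1 of section V.

```python
def condition_pcr5(m1, y):
    """m(. || Y) computed from formula (9), for a non-empty event Y."""
    y = frozenset(y)
    post = {}
    for x1, w in m1.items():
        inter = x1 & y
        if inter:
            # First sum of (9): X1 contributes its mass to X = X1 n Y.
            post[inter] = post.get(inter, 0.0) + w
        else:
            # X1 conflicts with Y: X1 keeps w^2/(1+w) and Y receives w/(1+w).
            post[x1] = post.get(x1, 0.0) + w ** 2 / (1.0 + w)
            post[y] = post.get(y, 0.0) + w / (1.0 + w)
    return post

A, B, C = frozenset("A"), frozenset("B"), frozenset("C")

prior = {A: 0.49, B: 0.49, C: 0.02}
post = condition_pcr5(prior, "AB")
```

Only the focal elements disjoint from $Y$ trigger a redistribution, so the cost is linear in the number of focal elements of the prior.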


IV. DECONDITIONING 


In the previous section we have proposed a new non-Bayesian
conditioning rule based on PCR5. This rule follows
Shafer’s idea except that we use PCR5 instead of Dempster’s
rule because we have shown the better efficiency of PCR5
in dealing with conflicting information w.r.t. other rules. In this
section, we also show the great benefit of the PCR5 rule for
the deconditioning problem. The belief conditioning problem
consists in finding a way to update any prior belief function
($Bel(.)$, $Pl(.)$ or $m(.)$) with new information related to the
(belief of) occurrence of a given conditional proposition of the
fusion space, say $Y$, in order to get a new belief function called
the conditional belief function. The deconditioning problem is the
inverse (dual) problem of conditioning. It consists in retrieving
the prior belief function from a given posterior/conditional
belief function. Deconditioning has not been investigated in
depth so far in the literature (to the knowledge of the authors)
since it is usually considered as impossible to achieve¹².
It may nevertheless present great interest for applications in advanced
information systems when only a posterior belief is available
(say, provided by a human or an AI expert system), but for
some reason we need to compute a new conditional belief
based on a different conditional hypothesis. This motivates
our research on developing deconditioning techniques. Since
$Bel(.)$ and $Pl(.)$ are in one-to-one correspondence with the basic
belief assignment (bba) mass $m(.)$, we focus our analysis on
the deconditioning of the conditional bba. More simply stated,
we want to see if, for any given conditional bba $m(. \parallel Y)$, we
can compute $m_1(.)$ such that $m(. \parallel Y) = PCR5(m_1(.), m_2(.))$
with $m_2(Y) = 1$, where $PCR5(m_1(.), m_2(.))$ denotes
the PCR5 fusion of $m_1(.)$ with $m_2(.)$. Let us examine the two
distinct cases for the deconditioning problem depending on the
(Bayesian or non-Bayesian) nature of the prior $m_1(.)$.

• Case of Bayesian prior $m_1(.)$: Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$,
with $n \geq 2$, Shafer’s model, where all $\theta_i$ are singletons.
Let $m_1 : \Theta \to [0,1]$ be a Bayesian bba/mass. In that case,
the deconditioning problem admits a unique solution
and we can always compute $m_1(.)$ from $m(. \parallel Y)$, but
two distinct cases must be analyzed depending on the
cardinality of the conditional term $Y$.

Case 1: When $Y$ is a singleton, i.e. $|Y| = 1$. Suppose
$m_2(Y) = 1$, with $Y = \theta_{j_0}$, for $j_0 \in \{1, 2, \ldots, n\}$,
where $j_0$ is fixed. Since the bba’s $m_1(.)$ and $m_2(.)$ are
both Bayesian in this case, $m(. \parallel Y)$ is also a Bayesian
bba (property P3), therefore $m(\theta_i \parallel Y) = a_i$, where all
$a_i \in [0,1]$ with $\sum_{i=1}^{n} a_i = 1$. How to find $m_1(.)$ such

¹²This truly happens when classical Bayes conditioning is used.


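that $m(. \parallel Y) = PCR5(m_1(.), m_2(.))$? Let us denote
$m_1(\theta_i) = x_i$, where all $x_i \in [0,1]$ and $\sum_{i=1}^{n} x_i = 1$.
We need to find all these $x_i$. We now combine $m_1(.)$
with $m_2(.)$ using the PCR5 fusion rule. We transfer $x_i$, for
$\forall i \neq j_0$, to $\theta_i$ and $\theta_{j_0}$ proportionally with respect to
their corresponding masses, $x_i$ and 1 respectively:

$$\frac{w_{\theta_i}}{x_i} = \frac{w_{\theta_{j_0}}}{1} = \frac{x_i}{x_i + 1}$$

whence $w_{\theta_i} = \frac{x_i^2}{x_i + 1}$ and $w_{\theta_{j_0}} = \frac{x_i}{x_i + 1}$. Hence
$a_i = \frac{x_i^2}{x_i + 1}$ for $i \neq j_0$, while

$$a_{j_0} = x_{j_0} + \sum_{\substack{i = 1 \\ i \neq j_0}}^{n} \frac{x_i}{x_i + 1} = 1 - \sum_{\substack{i = 1 \\ i \neq j_0}}^{n} \frac{x_i^2}{x_i + 1}.$$

Since we need to find all unknowns $x_i$, $i = 1, \ldots, n$,
we need to solve $\frac{x_i^2}{x_i + 1} = a_i$ for $i \neq j_0$ for $x_i$;
since $a_{j_0} = x_{j_0} + \sum_{i = 1,\, i \neq j_0}^{n} \frac{x_i}{x_i + 1}$, we get

$$x_{j_0} = a_{j_0} - \sum_{\substack{i = 1 \\ i \neq j_0}}^{n} \frac{x_i}{x_i + 1} = 1 - \sum_{\substack{i = 1 \\ i \neq j_0}}^{n} x_i.$$
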

Case 2: When $Y$ is not a singleton, i.e. $|Y| > 1$ ($Y$ can
be a partial or total ignorance). Suppose $m_2(Y) = 1$,
with $Y = \theta_{j_1} \cup \theta_{j_2} \cup \ldots \cup \theta_{j_p}$, where all $j_1, j_2, \ldots,
j_p$ are different and belong to $\{1, 2, \ldots, n\}$, $2 \leq
p \leq n$. We keep the same notations for $m(. \parallel Y)$ and
Bayesian $m_1(.)$. The set $\{j_1, j_2, \ldots, j_p\}$ is denoted $J$
for notational convenience. Similarly, using the PCR5 rule we
transfer $x_i$, $\forall i \notin J$, to $\theta_i$ and to the ignorance $Y =
\theta_{j_1} \cup \ldots \cup \theta_{j_p}$ proportionally with respect to $x_i$ and 1
respectively (as done in case 1). So, $x_i$ for $i \notin J$ is found
by solving the equation $\frac{x_i^2}{x_i + 1} = a_i$, which gives¹³ $x_i =
(a_i + \sqrt{a_i^2 + 4a_i})/2$; and $x_{j_r} = a_{j_r}$ for $r \in \{1, 2, \ldots, p\}$.
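The two Bayesian cases above can be sketched as a single routine. This is a minimal illustration under our own assumptions: the function name `decondition_bayesian` is hypothetical, elements of the frame are single-character strings, and the posterior is a dictionary mapping `frozenset` focal elements to masses. The first usage inverts the posterior $m(. \parallel A \cup B)$ reported in Table V; the second uses a made-up prior $(0.2, 0.3, 0.5)$ conditioned on the singleton $Y = A$ to exercise Case 1.

```python
import math

def decondition_bayesian(post, y, frame):
    """Recover a Bayesian prior from m(. || Y), following Cases 1 and 2."""
    y = frozenset(y)
    prior = {}
    for theta in frame:
        if theta in y:
            continue
        a = post.get(frozenset([theta]), 0.0)
        # Positive root of x^2 - a*x - a = 0, i.e. of x^2 / (x + 1) = a.
        prior[theta] = (a + math.sqrt(a * a + 4.0 * a)) / 2.0
    inside = sorted(y)
    if len(inside) == 1:
        # Case 1: x_j0 = 1 - sum of the other x_i.
        prior[inside[0]] = 1.0 - sum(prior.values())
    else:
        # Case 2: x_jr = a_jr for every singleton included in Y.
        for theta in inside:
            prior[theta] = post.get(frozenset([theta]), 0.0)
    return prior

# Case 2: posterior of Table V with Y = A u B.
post_ab = {frozenset("A"): 0.49, frozenset("B"): 0.49,
           frozenset("C"): 0.00039215, frozenset("AB"): 0.01960785}
prior_ab = decondition_bayesian(post_ab, "AB", "ABC")

# Case 1: hypothetical prior (0.2, 0.3, 0.5) conditioned on Y = A.
post_a = {frozenset("A"): 0.2 + 0.3 / 1.3 + 0.5 / 1.5,
          frozenset("B"): 0.3 ** 2 / 1.3,
          frozenset("C"): 0.5 ** 2 / 1.5}
prior_a = decondition_bayesian(post_a, "A", "ABC")
```

The negative root of the quadratic is never computed, since it cannot be a mass of belief.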


• Case of Non-Bayesian prior $m_1(.)$:

Unfortunately, when $m_1(.)$ is Non-Bayesian, the (PCR5-based)
deconditioning problem does not admit a unique
solution in general (see Example 2.1 in the next
section). But the method used to decondition PCR5 when
$m_1(.)$ is Bayesian can be generalized to a non-Bayesian
$m_1(.)$ in the following way: 1) We need to know the
focal elements of $m_1(.)$; we then denote the masses of
these elements by, say, $x_1, x_2, \ldots, x_n$; 2) Then we combine
$m_1(.)$ with $m_2(Y) = 1$ using the conjunctive rule, where
$Y$ can be a singleton or an ignorance; 3) Afterwards, we
use the PCR5 rule and we get results of the form $f_i(x_1, \ldots, x_n)$
for each element, where $i = 1, 2, \ldots$. Since we know
the results of PCR5 as $m(. \parallel Y) = a_i$ for each focal
element, we then form a system of non-linear equations
$f_i(x_1, x_2, \ldots, x_n) = a_i$ and we need to solve it. Such
systems of equations can however admit several solutions.
We can select a solution satisfying an additional criterion,
for example the minimum (or the maximum) of
specificity, depending on the kind of Non-Bayesian prior
we need to use.


V. EXAMPLES 
A. Example 1: Conditioning of a Bayesian prior belief 
Let us consider $\Theta = \{A, B, C\}$, Shafer’s model, and the
prior bba’s $m_1(.)$ and $m_1'(.)$ given in Table IV and the
conditional evidence $Y = A \cup B$.


¹³The solution $x_i = (a_i - \sqrt{a_i^2 + 4a_i})/2$ must be discarded since it is
negative and cannot be considered as a mass of belief.

Table IV
BAYESIAN PRIORS (INPUTS).

Focal Elem.    m1(.)    m1'(.)
A              0.49     0.01
B              0.49     0.01
C              0.02     0.98

The significance of having two cases in the Bayesian prior
case is straightforward. We just want to show that two different
priors can yield the same posterior bba with the Bayes/SCR rule,
and thus we cannot retrieve these two distinct prior cases from
the posterior bba. We show that total deconditioning is
possible, however, when using our non-Bayesian conditioning
rule. The SCR and PCR5-based conditionings of $m_1(.)$ and $m_1'(.)$
are given¹⁴ in Table V. One sees that SCR of the two distinct
bba’s $m_1(.)$ and $m_1'(.)$ yields the same posterior/conditional
bba $m(.|Y)$, which means that in this very simple Bayesian
prior case, the deconditioning of $m(.|Y)$ is impossible to
obtain since at least two solutions¹⁵ for the prior beliefs are
admissible. The results provided by PCR5-based conditioning
make more sense from the authors’ point of view since it better takes
into account the degree of conflicting information in the
conditioning process. One sees that two distinct Bayesian priors
yield two distinct posterior bba’s with PCR5-based conditioning.
If one examines the belief and plausibility functions, one
gets, using the notations $\Delta(.|Y) = [Bel(.|Y), Pl(.|Y)]$, $\Delta'(.|Y) =
[Bel'(.|Y), Pl'(.|Y)]$, $\Delta(. \parallel Y) = [Bel(. \parallel Y), Pl(. \parallel Y)]$ and
$\Delta'(. \parallel Y) = [Bel'(. \parallel Y), Pl'(. \parallel Y)]$, the bounds given in Table VI.


Table V
CONDITIONAL BBA’S.

Focal Elem.    m(.|Y)    m'(.|Y)    m(.∥Y)        m'(.∥Y)
A              0.5       0.5        0.4900        0.0100
B              0.5       0.5        0.4900        0.0100
C              0         0          0.00039215    0.48505051
A ∪ B          0         0          0.01960785    0.49494949


Table VI
CONDITIONAL LOWER AND UPPER BOUNDS OF CONDITIONAL
PROBABILITIES

Focal Elem.    Δ(.∥Y)              Δ'(.∥Y)
A              [0.4900, 0.5096]    [0.0100, 0.5050]
B              [0.4900, 0.5096]    [0.0100, 0.5050]
C              [0.0004, 0.0004]    [0.4850, 0.4850]
A ∪ B          [0.9996, 0.9996]    [0.5150, 0.5150]
A ∪ C          [0.4904, 0.5100]    [0.4950, 0.9900]
B ∪ C          [0.4904, 0.5100]    [0.4950, 0.9900]
A ∪ B ∪ C      [1, 1]              [1, 1]


The interval $\Delta(.|Y)$ corresponds to the lower and upper bounds
of the conditional subjective probabilities $P(.|Y)$, and $\Delta(. \parallel Y)$
corresponds to the lower and upper bounds of $P(. \parallel Y)$ (similarly
for $\Delta'(.|Y)$ and $\Delta'(. \parallel Y)$). From Table VI, one sees that
property P2 is verified and we get an imprecise conditional
probability. One sees that, contrary to SCR (equivalent
to Bayes’ rule in this case), one gets $Bel(Y \parallel Y) < 1$ and
also $Pl(Y \parallel Y) < 1$. $\Delta(. \parallel Y)$ and $\Delta'(. \parallel Y)$ are very different
because the priors were also very different. This is an appealing

¹⁴Due to space limitation constraints, the verification is left to the reader.
¹⁵Actually an infinite number of solutions exist.


property. If one approximates¹⁶ the conditional probabilities by
the mid-values of their lower and upper bounds¹⁷, one gets
the values given in Table VII.


Table VII
CONDITIONAL APPROXIMATE SUBJECTIVE PROBABILITIES.
[Columns: Focal Elem., $P(.|Y) = P'(.|Y)$, $P(. \parallel Y)$, $P'(. \parallel Y)$; the table body is not recoverable.]


When the conditioning hypothesis supports the prior
belief (as for $m_1(.)$ and $m_2(.)$, which are in low conflict),
the PCR5-based conditioning behaves like SCR (like Bayes’
rule when dealing with Bayesian priors) and $P(. \parallel Y)$ is
very close to $P(.|Y)$. When the prior and the conditional
evidences are highly conflicting (like $m_1'(.)$ and $m_2(.)$), the
PCR5-based conditioning rule is much more prudent than
Shafer’s rule, and that is why it allows the possibility of having
$P(Y \parallel Y) < 1$. Such a property does not violate the fundamental
axioms (nonnegativity, unity and additivity) of Kolmogorov’s
axiomatic theory of probabilities, and this can be verified
easily in our example. In applications, it is much better
to preserve all available information and to work directly
with conditional bba’s whenever possible rather than with
approximate subjective conditional probabilities.


The deconditioning of the posterior bba’s $m(. \parallel Y)$ given
in Table V is done using the principle described in
section IV (when $m_1(.)$ is assumed Bayesian and for case
2). We denote the unknowns $m_1(A) = x_1$, $m_1(B) = x_2$
and $m_1(C) = x_3$. Since $Y = A \cup B$ and $J = \{1, 2\}$, we
solve the following system of equations (with the constraint
$x_i \in [0, 1]$): $x_1 = a_1 = 0.49$, $x_2 = a_2 = 0.49$ and
$x_3^2/(x_3 + 1) = a_3 = 0.00039215$. Therefore, one gets
after deconditioning $m_1(A) = 0.49$, $m_1(B) = 0.49$ and
$m_1(C) = 0.02$. Similarly, the deconditioning of $m'(. \parallel Y)$
given in Table V yields $m_1'(A) = 0.01$, $m_1'(B) = 0.01$
and $m_1'(C) = 0.98$.

¹⁶When the lower bound is equal to the upper bound, one gets the exact
probability value.

¹⁷More sophisticated transformations could be used instead, as explained in
[18], Vol. 3.

Note that, contrary to Bayes’ or Jeffrey’s rules [8],
[11], [21], it is possible to update the prior opinion about an
event $A$ even if $P(A) = 0$ using this Non-Bayesian rule. For
example, let us consider $\Theta = \{A, B, C\}$, Shafer’s model and
the prior Bayesian mass $m_1(A) = 0$, $m_1(B) = 0.3$ and $m_1(C)
= 0.7$, i.e. $Bel_1(A) = P_1(A) = Pl_1(A) = 0$. Assume that the
conditional evidence is $Y = A \cup B$; then one gets with SCR
$m(B|A \cup B) = 1$ and with PCR5-based conditioning
$m(B \parallel A \cup B) = 0.30$, $m(A \cup B \parallel A \cup B) = 0.41176$ and
$m(C \parallel A \cup B) = 0.28824$, which means that $P(A|A \cup B) = 0$
with SCR/Bayes’ rule (i.e. no update on $A$), whereas
$[Bel(A \parallel A \cup B), Pl(A \parallel A \cup B)] = [0, 0.41176]$,
$[Bel(B \parallel A \cup B), Pl(B \parallel A \cup B)] = [0.30, 0.71176]$ and
$[Bel(C \parallel A \cup B), Pl(C \parallel A \cup B)] = [0.28824, 0.28824]$, that is
$P(A \parallel A \cup B) \in [0, 0.41176]$. Typically, if one approximates
$P(. \parallel A \cup B)$ by the mid-value of its lower and upper bounds, one
will obtain $P(A \parallel A \cup B) = 0.20588$ (i.e. a true update of
the prior probability of $A$), $P(B \parallel A \cup B) = 0.50588$ and
$P(C \parallel A \cup B) = 0.28824$.
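The interval and mid-value figures quoted for this zero-prior example can be reproduced from the posterior masses with definitions (10) and (11). The sketch below is only an illustration: the helper names `bel` and `pl` and the dictionary representation are our assumptions, and the posterior masses are copied from the text above rather than recomputed.

```python
def bel(m, x):
    """Belief (10): total mass of focal elements included in X."""
    x = frozenset(x)
    return sum(w for z, w in m.items() if z <= x)

def pl(m, x):
    """Plausibility (11): total mass of focal elements intersecting X."""
    x = frozenset(x)
    return sum(w for z, w in m.items() if z & x)

# Posterior m(. || A u B) quoted in the text for the prior (0, 0.3, 0.7).
post = {frozenset("B"): 0.30, frozenset("AB"): 0.41176, frozenset("C"): 0.28824}

intervals = {x: (bel(post, x), pl(post, x)) for x in "ABC"}
midpoints = {x: (lo + hi) / 2.0 for x, (lo, hi) in intervals.items()}
```

The interval for $A$ has a zero lower bound but a strictly positive upper bound, which is what allows the mid-value approximation to move $P(A)$ away from zero.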


B. Example 2: Conditioning of a Non-Bayesian prior belief 


Example 2.1: Let us now consider $\Theta = \{A, B, C\}$, Shafer’s
model, the conditioning hypothesis $Y = A \cup B$ and the
following Non-Bayesian priors:

Table VIII
NON-BAYESIAN PRIORS (INPUTS).
[The table body is not recoverable.]



The conflict between $m_1(.)$ and $m_2(Y) = 1$ and between
$m_1'(.)$ and $m_2(Y) = 1$ is 0.10 in both cases. The results of
the conditioning are given in Table IX. One sees that when
distinct priors are Non-Bayesian, it can happen that the PCR5-based
conditioning rule also yields the same posterior bba’s.
This result shows that in general, with Non-Bayesian priors,
PCR5-based deconditioning cannot provide a unique solution
unless extra information and/or constraints on the prior belief are
specified, as shown in the next example.


Table IX
CONDITIONAL BBA’S.

Focal Elem.    m(.|Y)    m'(.|Y)    m(.∥Y)    m'(.∥Y)
A              0.222     0.222      0.20      0.20
B              0.333     0.333      0.30      0.30
C              0         0          0.01      0.01
A ∪ B          0.445     0.445      0.49      0.49

Example 2.2: Let us now consider $\Theta = \{A, B, C, D\}$, Shafer’s
model, the conditional evidence $Y = C \cup D$ and the posterior
bba $m(. \parallel C \cup D)$ given in the right column of the table below:


Table X
CONDITIONAL BBA’S.

Focal Elem.    m(.∥C ∪ D)
A              0.0333
B              0.1667
C ∪ D          0.8000


If we assume that the focal elements of the prior bba $m_1(.)$
are the same as those of the posterior bba $m(. \parallel C \cup D)$, then with
this extra assumption, the deconditioning problem admits a
unique solution which is obtained by solving the system of







three equations according to Table X; that is $\frac{x_1^2}{x_1 + 1} = 0.0333$,
whence $x_1 \approx 0.2$; $\frac{x_2^2}{x_2 + 1} = 0.1667$, whence $x_2 \approx 0.5$;
$x_3 + \frac{x_1}{x_1 + 1} + \frac{x_2}{x_2 + 1} = 0.8000$, whence $x_3 \approx 0.3$. Therefore, the
deconditioning of $m(. \parallel C \cup D)$ provides the unique Non-Bayesian
solution $m_1(A) = 0.2$, $m_1(B) = 0.5$ and
$m_1(C \cup D) = 0.3$.
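The three equations of Example 2.2 can be checked numerically with a few lines of Python. This is a hedged sketch: the helper name `positive_root` and the variable names are ours, and the inputs are the rounded posterior masses of Table X, so the recovered masses are only approximately 0.2, 0.5 and 0.3.

```python
import math

def positive_root(a):
    """Positive solution of x^2 / (x + 1) = a, i.e. of x^2 - a*x - a = 0."""
    return (a + math.sqrt(a * a + 4.0 * a)) / 2.0

# Posterior masses m(. || C u D), rounded to four decimals as in Table X.
a_A, a_B, a_CD = 0.0333, 0.1667, 0.8000

x1 = positive_root(a_A)                              # prior mass of A
x2 = positive_root(a_B)                              # prior mass of B
x3 = a_CD - x1 / (1.0 + x1) - x2 / (1.0 + x2)        # prior mass of C u D
```

The third unknown follows by subtraction because the mass of $C \cup D$ in the posterior is the prior mass of $C \cup D$ plus the shares transferred from $A$ and $B$.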
VI. CONCLUSIONS


In this paper, we have proposed a new Non-Bayesian
conditioning rule (denoted $\parallel$) based on the Proportional Conflict
Redistribution (PCR) rule of combination developed in the DSmT
framework. This new conditioning rule offers the advantage of
taking fully into account the level of conflict between the prior
and the conditional evidences for updating belief functions. It
is truly Non-Bayesian since it does not satisfy the Bayes principle:
it allows $P(X \parallel X)$ or $Bel(X \parallel X)$ to be less than
one. We have also shown that this approach makes it possible to solve
the deconditioning (dual) problem for the class of Bayesian
priors. More investigations on the deconditioning problem for
Non-Bayesian priors need to be done, and comparisons of
this new rule with the main alternatives to Bayes’
rule proposed in the literature (typically Jeffrey’s rule and its
extensions, Planchet’s rule, etc.) will be presented in detail in
a forthcoming publication.
Multi-criteria decision making based on DSmT-AHP


Jean Dezert 
Jean-Marc Tacnet 
Mireille Batton-Hubert 
Florentin Smarandache 


Abstract: In this paper, we present an extension of the multi-criteria decision making based on the Analytic Hierarchy Process (AHP) which incorporates uncertain knowledge matrices for generating basic belief assignments (bba's). The combination of priority vectors corresponding to bba's related to each (sub-)criterion is performed using the Proportional Conflict Redistribution rule no. 5 proposed in Dezert-Smarandache Theory (DSmT) of plausible and paradoxical reasoning. The method presented here, called DSmT-AHP, is illustrated on very simple examples.


Keywords: Analytic Hierarchy Process, AHP, DSmT, Information Fusion, Decision Making, Multi-Criteria.


I. INTRODUCTION 


The Multi-Criteria Decision-Making (MCDM) problem concerns the elucidation of the level of preferences of decision alternatives through judgments made over a number of criteria [6]. At the Decision-Maker (DM) level, a useful method for solving an MCDM problem must take into account opinions made under uncertainty and based on distinct criteria with different importances. The difficulty of the problem increases if we consider a group decision-making (GDM) problem involving a panel of decision-makers. Several attempts have been proposed in the literature to solve the Multi-Criteria Group Decision-Making (MCGDM) problem. Among the interesting solutions developed, one must cite the works made by Beynon [3]-[6]. This author developed a method called DS/AHP which extended the Analytic Hierarchy Process (AHP) method of Saaty [15]-[17] with Dempster-Shafer Theory (DST) [23] of belief functions to take into account uncertainty and to manage the conflicts between experts' opinions within a hierarchical model approach. In this paper, we propose to follow Beynon's approach, but instead of using DST, we investigate the possibility of using Dezert-Smarandache Theory (DSmT) of plausible and paradoxical reasoning, developed since 2002 for overcoming DST limitations¹ [24]. This new approach will be referred to as the DSmT-AHP method in the sequel. DSmT allows to manage efficiently the fusion of quantitative (or qualitative) uncertain and possibly highly conflicting sources of evidence, and proposes new methods for belief conditioning and deconditioning as well [7]. DSmT has been successfully applied in several fields of application (defense, medicine, satellite surveillance, biometrics, image processing, etc.). In Section II, we briefly introduce the principle of the AHP developed by Saaty. In Section III, we recall the basis of DSmT and its main rule of combination, called PCR5 (Proportional Conflict Redistribution rule no. 5). In Section IV, we present the DSmT-AHP method for solving the MCDM problem. The extension of the DSmT-AHP method for solving the MCGDM problem is then introduced in Section V. Conclusions are given in Section VI.


II. THE ANALYTIC HIERARCHY PROCESS (AHP) 


The Analytic Hierarchy Process (AHP) is a structured technique developed by Saaty in [8], [15], [16], based on mathematics and psychology, for dealing with complex decisions. AHP and its refinements are used around the world in many decision situations (government, industry, education, healthcare, etc.). It helps the DM to find the decision that best suits his/her needs and his/her understanding of the problem.


¹ A presentation of these limitations with a discussion is given in Chap. 1 of [24], Vol. 3. It is shown clearly that the logical refinement proposed by some authors doesn't bring new insights with respect to what is done when working directly on the super-power set (i.e. on the minimal refined frame satisfying Shafer's model). There is no necessity to work with a refined frame in the DSmT framework, which is very attractive in some real-life problems where the elements of the refined frame do not have any (physical) sense/meaning or are just impossible to clearly determine physically (as a simple example, if Mary and Paul have possibly committed a crime alone or together, there is no way to refine these two persons into three finer exclusive physical elements satisfying Shafer's model). Aside from the possibility to deal with different underlying models of the frame, it is worth noting that the PCR5 or PCR6 rules provide a better ability than the other rules to deal efficiently with highly conflicting sources of evidence, as shown in all fields of application where they have been tested so far.

AHP provides a comprehensive and rational framework for structuring a decision problem, for representing and quantifying its elements, for relating those elements to overall goals, and for evaluating alternative solutions. The basic idea of AHP is to decompose the decision problem into a hierarchy of more easily comprehended sub-problems, each of which can be analyzed independently. Once the hierarchy is built, the DM evaluates the various elements of the hierarchy by comparing them to one another two at a time [21]. In making the comparisons, the DM can use both objective information about the elements and subjective opinions about the elements' relative meaning and importance. The AHP converts these evaluations to numerical values that are processed and compared over the entire range of the problem. A numerical weight or priority is derived for each element of the hierarchy, allowing diverse and often incommensurable elements to be compared to one another in a rational and consistent way. This is the main advantage of AHP with respect to other decision-making techniques. At its final step, numerical priorities are calculated for each of the decision alternatives. These numbers represent the alternatives' relative ability to achieve the decision goal. The AHP method can be summarized as [19]:
1) Model the problem as a hierarchy containing the decision goal, the alternatives for reaching it, and the criteria for evaluating the alternatives.

2) Establish priorities among the elements of the hierarchy by making a series of judgments based on pairwise comparisons of the elements.

3) Check the consistency of the judgments and, if necessary, revise the comparison matrices by asking the experts again when the consistency in judgments is too low.

4) Synthesize these judgments to yield a set of overall priorities for the hierarchy.

5) Come to a final decision based on the results of this process.

Example 1: According to his/her own preferences and using Saaty's 1-9 ordinal scale, a DM wants to buy a car among four available models belonging to the set Θ = {A, B, C, D}. To simplify the example, we assume that the objective of the DM is to select one of these cars based only on three criteria (C1 = Fuel economy, C2 = Reliability and C3 = Style). According to his/her own preferences, the DM ranks the different criteria pairwise as follows: 1 - Reliability is 3 times as important as fuel economy, 2 - Fuel economy is 4 times as important as style, 3 - Reliability is 5 times as important as style, which means that the DM thinks that Reliability (C2) is the most important criterion, followed by fuel economy (C1), and that style (C3) is the least important criterion². The relative importance of one criterion over another can be expressed using a pairwise comparison matrix (also called knowledge matrix) as follows:

$$M = \begin{bmatrix} 1/1 & 1/3 & 4/1 \\ 3/1 & 1/1 & 5/1 \\ 1/4 & 1/5 & 1/1 \end{bmatrix} \approx \begin{bmatrix} 1.0000 & 0.3333 & 4.0000 \\ 3.0000 & 1.0000 & 5.0000 \\ 0.2500 & 0.2000 & 1.0000 \end{bmatrix}$$

where the element m_ij of the matrix M indicates the relative importance of criterion Ci with respect to criterion Cj.

² The relationships between preferences given by a DM may not be transitive, as shown in this example; nevertheless one has to deal with these inputs even in such situations.

In this example, m_13 = 4/1 indicates that criterion C1 (Fuel economy) is four times as important as criterion C3 (Style) for the DM, etc. From this pairwise matrix, Saaty demonstrated that the ranking of the priorities of the criteria can be obtained from the normalized eigenvector³, denoted w, associated with the principal eigenvalue of the matrix, denoted λ. In this example, one has λ = 3.0857 and w = [0.2797 0.6267 0.0936]', which shows that criterion C2 (reliability) is the most important criterion with weight 0.6267, then the fuel economy criterion C1 is the second most important criterion with weight 0.2797, and finally criterion C3 (Style) is the least important criterion with weight 0.0936 for the DM. A similar ranking procedure can be used to find the relative weights of each car A, B, C or D with respect to each criterion C1, C2 and C3 based on given DM preferences, hence one will get three new normalized eigenvectors denoted
w(C1), w(C2) and w(C3). For example, if one has the following normalized vectors

$$[w(C1)\ w(C2)\ w(C3)] = \begin{bmatrix} 0.2500 & 0.4733 & 0.1129 \\ 0.1304 & 0.0611 & 0.4435 \\ 0.5109 & 0.1832 & 0.0565 \\ 0.1087 & 0.2824 & 0.3871 \end{bmatrix}$$

then the solution of the MCDM problem (here the selection of the "best" car according to the DM's multi-criteria preferences) is finally obtained by multiplying the matrix [w(C1) w(C2) w(C3)] by the criteria ranking vector w. For this example, one will get:

$$\begin{bmatrix} 0.2500 & 0.4733 & 0.1129 \\ 0.1304 & 0.0611 & 0.4435 \\ 0.5109 & 0.1832 & 0.0565 \\ 0.1087 & 0.2824 & 0.3871 \end{bmatrix} \times \begin{bmatrix} 0.2797 \\ 0.6267 \\ 0.0936 \end{bmatrix} = \begin{bmatrix} 0.3771 \\ 0.1163 \\ 0.2630 \\ 0.2436 \end{bmatrix}$$

Based on this result, car A, which has the largest weight (0.3771), will be selected by the DM. The costs could also be included in AHP by taking into account the benefit-to-cost ratios, which allows to choose the alternative with the lowest cost and the highest benefit. For example, let's suppose that the cost of car A is 21000 euros, the cost of car B is 13000 euros, the cost of car C is 12000 euros and the cost of car D is 18000 euros; then the normalized cost vector is [0.3281 0.2031 0.1875 0.2812]', so that the benefit-cost ratios are now [0.3771/0.3281 = 1.1492, 0.1163/0.2031 = 0.5724, 0.2630/0.1875 = 1.4026, 0.2436/0.2812 = 0.8663]'. Taking into account the cost of the vehicles, the best solution for the DM is now to choose car C since it offers the highest benefit-cost ratio.
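The synthesis step of Example 1 and the benefit-cost variant can be reproduced with a few lines. This is a minimal sketch, not the authors' implementation: the variable names are illustrative, and the row alignment of the three priority vectors is assumed to be the one that reproduces the quoted result [0.3771 0.1163 0.2630 0.2436]'.

```python
# Priorities of cars A, B, C, D (rows) under criteria C1, C2, C3 (columns).
# Assumption: rows are aligned so that W w matches the result quoted in Example 1.
cars = ["A", "B", "C", "D"]
W = [
    [0.2500, 0.4733, 0.1129],
    [0.1304, 0.0611, 0.4435],
    [0.5109, 0.1832, 0.0565],
    [0.1087, 0.2824, 0.3871],
]
w = [0.2797, 0.6267, 0.0936]          # criteria weights from Example 1
costs = [21000, 13000, 12000, 18000]  # euros, from Example 1

# AHP synthesis: matrix-vector product W w.
priorities = [sum(wij * wj for wij, wj in zip(row, w)) for row in W]

# Benefit-to-cost ratio: priority divided by the normalized cost.
total_cost = sum(costs)
ratios = [p / (c / total_cost) for p, c in zip(priorities, costs)]

best_by_priority = cars[priorities.index(max(priorities))]
best_by_ratio = cars[ratios.index(max(ratios))]
print(best_by_priority, best_by_ratio)  # prints: A C
```

The ratios computed here may differ from the quoted ones in the fourth decimal, since the text appears to use unrounded intermediate values.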


In this paper we do not focus on the rank reversal problem of AHP as discussed in [9], [10], [13], [18], [22], but we propose an extension of AHP using an aggregation method developed in the DSmT framework, able to make a difference between the importance of criteria, the uncertainty related to the evaluations of criteria and the reliability of the different sources.


³ Note that if the relationships on the criteria are transitive, then we can easily construct the normalized vector of priorities from a system of algebraic equations, without employing Saaty's matrix approach. For example, if in the previous example one assumes⁴ m_23 = 12/1 and m_32 = 1/12 instead of 5/1 and 1/5, then the normalized weighting vector will be directly obtained as w = [4/17 12/17 1/17]'.
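The priority vector of Example 1 and the transitive case of footnote 3 can both be checked with a plain power iteration. The sketch below is an assumption-laden illustration: the function name `principal_eigenvector` and the fixed iteration count are choices made here, and any eigen-solver could be substituted.

```python
def principal_eigenvector(matrix, iterations=200):
    """Return (eigenvalue, normalized eigenvector) by power iteration."""
    n = len(matrix)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        mv = [sum(row[j] * v[j] for j in range(n)) for row in matrix]
        lam = sum(mv)              # v sums to 1, so sum(M v) estimates lambda
        v = [x / lam for x in mv]  # renormalize so that the weights sum to 1
    return lam, v


# Pairwise comparison matrix of Example 1.
M = [[1, 1 / 3, 4], [3, 1, 5], [1 / 4, 1 / 5, 1]]
lam, w = principal_eigenvector(M)

# Transitive variant of footnote 3 (m_23 = 12, m_32 = 1/12).
M_consistent = [[1, 1 / 3, 4], [3, 1, 12], [1 / 4, 1 / 12, 1]]
lam_c, w_c = principal_eigenvector(M_consistent)

print(round(lam, 3), [round(x, 4) for x in w])
print(round(lam_c, 3), [round(x, 4) for x in w_c])
```

For the transitive matrix the principal eigenvalue equals the matrix size (3), and the weights coincide with the algebraic solution [4/17 12/17 1/17]'.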
III. BASICS OF DSMT
Let Θ = {θ1, θ2, ..., θn} be a finite set of n elements assumed to be exhaustive. Θ corresponds to the frame of discernment of the problem under consideration. In general, we assume that elements of Θ are non-exclusive in order to deal with vague/fuzzy and relative concepts [24], Vol. 2. This is the so-called free-DSm model. In DSmT, there is no need to work on a refined frame consisting of a discrete finite set of exclusive and exhaustive hypotheses⁵ because DSm rules of combination work for any model of the frame. The hyper-power set D^Θ is defined as the set of all propositions built from elements of Θ with ∪ and ∩, see [24], Vol. 1 for examples. A (quantitative) basic belief assignment (bba) expressing the belief committed to the elements of D^Θ by a given source is a mapping m(.): D^Θ → [0, 1] such that m(∅) = 0 and Σ_{A ∈ D^Θ} m(A) = 1. Elements A ∈ D^Θ having m(A) > 0 are called focal elements of m(.). The credibility and plausibility functions are defined in almost⁶ the same manner as in DST [23]. In DSmT, the Proportional Conflict Redistribution Rule no. 5 (PCR5) is generally used to combine bba's. PCR5 transfers the conflicting mass only to the elements involved in the conflict and proportionally to their individual masses, so that the specificity of the information is entirely preserved in this fusion process. For example: consider two bba's m1(.) and m2(.), A ∩ B = ∅ for the model of Θ, and m1(A) = 0.6 and m2(B) = 0.3. With PCR5 the partial conflicting mass m1(A)m2(B) = 0.6 · 0.3 = 0.18 is redistributed to A and B only, with respect to the following proportions respectively: x_A = 0.12 and x_B = 0.06 because

$$\frac{x_A}{m_1(A)} = \frac{x_B}{m_2(B)} = \frac{m_1(A)\, m_2(B)}{m_1(A) + m_2(B)} = \frac{0.18}{0.9} = 0.2$$
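The proportional split of one partial conflict is easy to verify numerically. The helper below is a small illustrative sketch (the name `pcr5_partial_split` is hypothetical); it only covers a single pair of conflicting focal elements, not the full rule.

```python
def pcr5_partial_split(mass_a: float, mass_b: float) -> tuple:
    """Split the conflicting product mass_a * mass_b proportionally to the two masses."""
    total = mass_a + mass_b
    if total == 0.0:
        return 0.0, 0.0  # nothing to redistribute
    ratio = mass_a * mass_b / total  # common ratio x_A / m1(A) = x_B / m2(B)
    return mass_a * ratio, mass_b * ratio


x_a, x_b = pcr5_partial_split(0.6, 0.3)
print(round(x_a, 2), round(x_b, 2))  # prints: 0.12 0.06
```

The two shares always add up to the conflicting product (here 0.12 + 0.06 = 0.18), so no mass is created or lost by the redistribution.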








In this paper, we work in the power set 2^Θ since most readers are usually already familiar with this fusion space. Let m1(.) and m2(.) be two independent⁷ bba's; then the PCR5 rule is defined as follows (see [24], Vol. 2 for full justification and examples): m_PCR5(∅) = 0 and ∀X ∈ 2^Θ \ {∅}

$$m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) + \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \qquad (1)$$

where all denominators in (1) are different from zero. If a denominator is zero, that fraction is discarded. All propositions/sets are in a canonical form. A variant of (1), called PCR6, for combining s > 2 sources and for working in other fusion spaces (hyper-power sets or super-power sets) is presented in [24]. Additional properties of PCR5 can be found in [7]. Extensions of PCR5 for combining qualitative bba's can be found in [24], Vol. 2 & 3.
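Formula (1) can be read as a loop over all pairs of focal elements. The following sketch is one possible reading under stated assumptions: bba's are represented as dictionaries mapping `frozenset` focal elements to masses, Shafer's model holds (set intersection decides whether two elements conflict), and the inputs carry no mass on the empty set. The function name `pcr5` and the completed bba's `m1`, `m2` are illustrative; only m1(A) = 0.6 and m2(B) = 0.3 come from the text.

```python
from itertools import product


def pcr5(m1, m2):
    """Combine two bba's ({frozenset: mass}) following the structure of eq. (1)."""
    fused = {}
    for (x1, a), (x2, b) in product(m1.items(), m2.items()):
        common = x1 & x2
        if common:
            # Conjunctive consensus: the product goes to the intersection.
            fused[common] = fused.get(common, 0.0) + a * b
        elif a + b > 0.0:
            # Partial conflict: redistribute a*b back to x1 and x2 proportionally.
            fused[x1] = fused.get(x1, 0.0) + a * a * b / (a + b)
            fused[x2] = fused.get(x2, 0.0) + b * b * a / (a + b)
    return fused


A, B = frozenset("A"), frozenset("B")
AB = A | B
m1 = {A: 0.6, AB: 0.4}  # hypothetical completion of m1(A) = 0.6
m2 = {B: 0.3, AB: 0.7}  # hypothetical completion of m2(B) = 0.3
fused = pcr5(m1, m2)
print({"".join(sorted(x)): round(v, 2) for x, v in fused.items()})
```

With these hypothetical inputs, A receives the consensus 0.6 · 0.7 plus the share 0.12 of the conflict, and B receives 0.4 · 0.3 plus 0.06.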


⁵ Referred to as Shafer's model in the literature.

⁶ We just replace 2^Θ by D^Θ in the definitions of credibility and plausibility functions.

⁷ I.e. each source provides its bba independently of the other sources.


IV. DSMT-AHP FOR SOLVING MCDM 


DSmT-AHP aims to serve a similar purpose as AHP [15], [16], SMART [28] or DS/AHP [2], [4], etc., that is, to find the preference rankings of the decision alternatives (DA), or groups of DA. The DSmT-AHP approach consists of three steps:


• Step 1: We extend the construction of the matrix to take into account the partial uncertainty (disjunctions) between possible alternatives. If no comparison is available between elements, then the corresponding element in the matrix is zero. Each bba related to each (sub-)criterion is the normalized eigenvector associated with the largest eigenvalue of the "uncertain" knowledge matrix (as done in the standard AHP approach).

• Step 2: We use the DSmT fusion rules, typically the PCR5 rule, to combine the bba's drawn from Step 1 to get a final MCDM priority ranking. This fusion step must take into account the different importances (if any) of the criteria, as will be explained in the sequel.

• Step 3: Decision-making can be done based either on the maximum of belief, or on the maximum of the plausibility of the decision alternatives (DA), as well as on the maximum of the approximate subjective probability of DA obtained by different probabilistic transformations.


Example 2: Let's now consider a set of three cars Θ = {A, B, C} and the criteria C1 = Fuel Economy, C2 = Reliability. Let's assume that with respect to each criterion the following "uncertain" knowledge matrices are given:

$$M(C1) = \begin{array}{c|ccc} & A & B \cup C & A \cup B \cup C \\ \hline A & 1 & 0 & 1/3 \\ B \cup C & 0 & 1 & 2 \\ A \cup B \cup C & 3 & 1/2 & 1 \end{array}$$

$$M(C2) = \begin{array}{c|cccc} & A & B & A \cup C & B \cup C \\ \hline A & 1 & 2 & 4 & 3 \\ B & 1/2 & 1 & 1/2 & 1/5 \\ A \cup C & 1/4 & 2 & 1 & 0 \\ B \cup C & 1/3 & 5 & 0 & 1 \end{array}$$

Step 1: (bba's generation) Applying the AHP method, one gets the following priority vectors w(C1) ≈ [0.0889 0.5337 0.3774]' and w(C2) ≈ [0.5002 0.1208 0.1222 0.2568]', which are identified with the bba's m_C1(.) and m_C2(.) as follows: m_C1(A) = 0.0889, m_C1(B ∪ C) = 0.5337, m_C1(A ∪ B ∪ C) = 0.3774 and m_C2(A) = 0.5002, m_C2(B) = 0.1208, m_C2(A ∪ C) = 0.1222 and m_C2(B ∪ C) = 0.2568.
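Step 1 can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the helper names are hypothetical, the eigenvector is obtained by power iteration, and the entry of M(C2) in row A, column B ∪ C is taken to be 3 (the reciprocal of the 1/3 in row B ∪ C), which is the value that reproduces the quoted w(C2).

```python
def principal_eigenvector(matrix, iterations=500):
    """Return (eigenvalue, normalized eigenvector) by power iteration."""
    n = len(matrix)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        mv = [sum(row[j] * v[j] for j in range(n)) for row in matrix]
        lam = sum(mv)              # v sums to 1, so sum(M v) estimates lambda
        v = [x / lam for x in mv]  # keep the weights normalized to 1
    return lam, v


def bba_from_matrix(focal_elements, matrix):
    """Identify the normalized principal eigenvector with a bba on the given focal elements."""
    _, weights = principal_eigenvector(matrix)
    return dict(zip(focal_elements, weights))


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")

m_c1 = bba_from_matrix(
    [A, B | C, A | B | C],
    [[1, 0, 1 / 3], [0, 1, 2], [3, 1 / 2, 1]],
)
m_c2 = bba_from_matrix(
    [A, B, A | C, B | C],
    [[1, 2, 4, 3], [1 / 2, 1, 1 / 2, 1 / 5], [1 / 4, 2, 1, 0], [1 / 3, 5, 0, 1]],
)

for name, m in (("m_C1", m_c1), ("m_C2", m_c2)):
    print(name, {"".join(sorted(x)): round(v, 4) for x, v in m.items()})
```

A zero entry in the matrix simply means that the two elements were not compared; the iteration still converges here because each matrix remains irreducible with a positive diagonal.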

Step 2: (Fusion) When the two criteria have the same full importance in the hierarchy, they are fused with one of the classical fusion rules. In [4], Beynon proposed to use Dempster's rule. Here we propose to use the PCR5 fusion rule since it is known to have a better ability to deal efficiently with possibly highly conflicting sources of evidence [24], Vol. 2. With PCR5, one gets:

Elem. of 2^Θ    m_C1(.)    m_C2(.)    m_PCR5(.)
∅               0          0          0
A               0.0889     0.5002     0.3837
B               0          0.1208     0.1162
A ∪ B           0          0          0
C               0          0          0.0652
A ∪ C           0          0.1222     0.0461
B ∪ C           0.5337     0.2568     0.3887
A ∪ B ∪ C       0.3774     0          0


Step 3: (Decision-making) A final decision based on m_PCR5(.) must be taken. Usually, the decision-maker (DM) is concerned with a single choice among the elements of Θ. Many decision-making approaches are possible depending on the risk the DM is ready to take. A pessimistic DM will choose the singleton of Θ giving the maximum of credibility, whereas an optimistic DM will choose the element having the maximum of plausibility. A fair attitude usually consists in choosing the maximum of approximate subjective probability of elements of Θ. The result however is very dependent on the probabilistic transformation (Pignistic, DSmP, Sudano's, etc.) [24], Vol. 2. Below are the values of the credibility, the pignistic probability and the plausibility of A, B and C:

Elem. of Θ    Bel(.)    BetP(.)    Pl(.)
A             0.3837    0.4068     0.4298
B             0.1162    0.3105     0.5049
C             0.0652    0.2826     0.5000

Car A will be preferred with the pessimistic or pignistic attitudes, whereas car B will be preferred if an optimistic attitude is adopted since one has Pl(B) > Pl(C) > Pl(A).
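The three decision attitudes of Step 3 can be sketched directly from the fused masses. The code below is illustrative only: it assumes Shafer's model on the power set, uses the rounded m_PCR5(.) values of Example 2, and the function names `bel`, `betp` and `pl` are choices made here.

```python
A, B, C = frozenset("A"), frozenset("B"), frozenset("C")

# Rounded PCR5 result of Example 2 (only the focal elements are listed).
m = {A: 0.3837, B: 0.1162, C: 0.0652, A | C: 0.0461, B | C: 0.3887}


def bel(m, x):
    """Credibility: total mass of the non-empty subsets of x."""
    return sum(v for s, v in m.items() if s and s <= x)


def pl(m, x):
    """Plausibility: total mass of the sets intersecting x."""
    return sum(v for s, v in m.items() if s & x)


def betp(m, x):
    """Pignistic probability: each mass is shared equally among the elements of its set."""
    return sum(v * len(s & x) / len(s) for s, v in m.items() if s)


singletons = {"A": A, "B": B, "C": C}
pessimistic = max(singletons, key=lambda k: bel(m, singletons[k]))
pignistic = max(singletons, key=lambda k: betp(m, singletons[k]))
optimistic = max(singletons, key=lambda k: pl(m, singletons[k]))
print(pessimistic, pignistic, optimistic)  # prints: A A B
```

The ordering Bel(X) ≤ BetP(X) ≤ Pl(X) holds for every singleton, which is why the pignistic choice can be seen as a compromise between the two extreme attitudes.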

The MCDM problem deals with several criteria having different importances, and the classical fusion rules cannot be applied directly as in Step 2. In AHP, the fusion is done from the product of the bba's matrix with the weighting vector of criteria. Such AHP fusion is nothing but a simple componentwise weighted average of bba's; it doesn't actually process efficiently the conflicting information between the sources, and it doesn't preserve the neutrality of a fully ignorant source in the fusion. To palliate these problems, we propose a solution for combining sources of different importances in the framework of DSmT and DST.

Before going further, it is essential to explain the difference between the importance and the reliability of a source of evidence. The reliability is an objective property of a source, whereas the importance of a source is a subjective characteristic expressed by the fusion system designer. The reliability of a source represents its ability to provide the correct assessment/solution of the given problem. It is characterized by a discounting reliability factor, usually denoted α in [0, 1], which should be estimated from statistics when available, or by other techniques [11]. The reliability can be context-dependent. By convention, we usually take α = 1 when the source is fully reliable and α = 0 if the source is totally unreliable. The reliability of a source is usually taken into account with Shafer's discounting method [23], defined by:

$$\begin{cases} m_\alpha(X) = \alpha \cdot m(X), & \text{for } X \neq \Theta \\ m_\alpha(\Theta) = \alpha \cdot m(\Theta) + (1 - \alpha) \end{cases} \qquad (2)$$

The importance of a source is not the same as its reliability, and it can be characterized by an importance factor, denoted β in [0, 1], which represents somehow the weight of importance granted to the source by the fusion system designer. The choice of β is usually not related to the reliability of the source, and β can be set to any value in [0, 1] by the designer for his/her own reasons. By convention, the fusion system designer will take β = 1 when he/she wants to grant the maximal importance to the source in the fusion process, and will take β = 0 if no importance at all is granted to this source in the fusion process. The fusion designer must be able to deal with importance factors in a different way than with reliability factors since they correspond to distinct properties associated with a source of information. The importance of a source is particularly crucial in hierarchical multi-criteria decision-making problems, especially in the AHP [16], [20]. That is why it is essential to show how the importance can be efficiently managed in evidential reasoning approaches. The main question we are concerned with here is how to deal with different importances of sources in the fusion process in such a way that a clear distinction is made/preserved between reliability and importance. Our preliminary investigations in search of a solution to this problem were based on the self/auto-combination of the sources. But such an approach is very disputable and cannot be used satisfactorily in practice, whatever fusion rule is adopted, because it can be easily shown that the auto-conflict tends quickly to 1 after several auto-fusions [11]. Actually, a better approach can be used for taking into account the importances of the sources, and it can be considered as the dual of Shafer's discounting approach for reliabilities of sources. The idea was originally introduced briefly by Tacnet in [24], Vol. 3, Chap. 23, p. 613. It consists in defining the importance discounting with respect to the empty set rather than the total ignorance Θ (as done with Shafer's discounting). Such a new discounting deals easily with sources of different importances and is very simple to use. Mathematically, we define the importance discounting of a source m(.) having the importance factor β in [0, 1] by:

$$\begin{cases} m_\beta(X) = \beta \cdot m(X), & \text{for } X \neq \emptyset \\ m_\beta(\emptyset) = \beta \cdot m(\emptyset) + (1 - \beta) \end{cases} \qquad (3)$$


Here we allow to deal with non-normal bba's since m_β(∅) ≥ 0, as suggested by Smets in [26]. This new discounting preserves the specificity of the primary information since all focal elements are discounted with the same importance factor. Here we use the positive mass of the empty set as an intermediate/preliminary step of the fusion process. Clearly, when β = 1 is chosen by the fusion designer, it means that the source must take its full importance in the fusion process, and so the original bba m(.) is kept unchanged. If the fusion designer takes β = 0, one will deal with m_β(∅) = 1, which is interpreted as a fully non-important source. m(∅) > 0 is not interpreted as the mass committed to some conflicting information (classical interpretation), nor as the mass committed to unknown elements when working with the open-world assumption (Smets' interpretation), but only as the mass of the discounted importance of a source in this particular context. Based on this discounting, one adapts the PCR5 (or PCR6) rule for N ≥ 2 discounted bba's m_{β,i}(.), i = 1, 2, ..., N, by considering the following extension, denoted PCR5∅, defined by: ∀X ∈ 2^Θ

$$m_{PCR5_\emptyset}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) + \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \qquad (4)$$

A similar extension can be done for the PCR5 and PCR6 formulas for N > 2 sources given in [24], Vol. 2. A detailed presentation of this technique with several examples will appear in [25] and thus it is not reported here. The difference between eqs. (1) and (4) is that m_PCR5(∅) = 0 whereas m_PCR5∅(∅) ≥ 0. Since we usually work with normal bba's for decision-making support, the combined bba will be normalized. In the AHP context, the importance factors correspond to the components of the normalized eigenvector w.
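The two discounting operations differ only in where the removed mass is sent. The sketch below illustrates this under stated assumptions: bba's are dictionaries from `frozenset` focal elements to masses, the function names are hypothetical, the bba is m_C1(.) of Example 2, and the reliability factor 0.5 is an arbitrary value chosen for illustration.

```python
EMPTY = frozenset()


def reliability_discount(m, alpha, frame):
    """Shafer's discounting, eq. (2): the removed mass goes to the whole frame."""
    out = {x: alpha * v for x, v in m.items()}
    out[frame] = out.get(frame, 0.0) + (1.0 - alpha)
    return out


def importance_discount(m, beta):
    """Importance discounting, eq. (3): the removed mass goes to the empty set."""
    out = {x: beta * v for x, v in m.items()}
    out[EMPTY] = out.get(EMPTY, 0.0) + (1.0 - beta)
    return out


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
THETA = A | B | C
m_c1 = {A: 0.0889, B | C: 0.5337, THETA: 0.3774}  # bba of criterion C1 (Example 2)

by_reliability = reliability_discount(m_c1, 0.5, THETA)  # alpha = 0.5 is hypothetical
by_importance = importance_discount(m_c1, 0.25)          # beta = 0.25

print(round(by_reliability[THETA], 4), round(by_importance[EMPTY], 4))  # prints: 0.6887 0.75
```

Both results still sum to one; only the importance-discounted bba is non-normal, since it places mass on the empty set.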
Example 3: Let's go back to Example 2 and assume that C2 (the reliability) is three times more important than C1 (fuel economy), so that the knowledge matrix is given by:

$$M = \begin{bmatrix} 1/1 & 1/3 \\ 3/1 & 1/1 \end{bmatrix} \approx \begin{bmatrix} 1.0000 & 0.3333 \\ 3.0000 & 1.0000 \end{bmatrix}$$

Its normalized principal eigenvector is w = [0.2500 0.7500]' and indicates that C2 is three times more important than C1, as expressed in the prior DM preferences for ranking criteria. w = [w1 w2]' can also be obtained directly by solving the algebraic system of equations w2 = 3w1 and w1 + w2 = 1 with w1, w2 ∈ [0, 1]. If we apply the importance discounting with β1 = w1 = 0.25 and β2 = w2 = 0.75, one gets the following discounted bba's:

[Table of the discounted bba's m_{β1,C1}(.) and m_{β2,C2}(.) over the elements of 2^Θ]

With the PCR5∅ fusion of the sources m_{β1,C1}(.) and m_{β2,C2}(.), one gets the results in the table. For decision-making support, one prefers to work with normal bba's. Therefore m_PCR5∅(.) is normalized by redistributing back m_PCR5∅(∅) proportionally to the masses of the other focal elements, as shown in the right column of the next table.

[Table of m_PCR5∅(.) and of its normalized version over the elements of 2^Θ]

If all sources have the same full importance (i.e. all β_i = 1), then m_PCR5∅(.) = m_PCR5(.), which is normal because in such a case m_{β_i=1,Ci}(.) = m_Ci(.). From the normalized m_PCR5∅(.) one can easily compute the credibility, pignistic probability or plausibility of each element of Θ for decision-making. In this example one gets:

Elem. of Θ    Bel(.)    BetP(.)    Pl(.)
A             0.5213    0.5741     0.6331
B             0.0351    0.2126     0.3963
C             0.0355    0.2134     0.3974
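The whole chain of Example 3 (importance discounting, PCR5∅ fusion, normalization) can be sketched as below. This is one possible reading of eq. (4), not the authors' implementation: the empty set is treated as an ordinary focal element, so that a product involving the empty set and a non-empty set is redistributed proportionally between the two, while the product m1(∅)m2(∅) stays on the empty set. The function names are hypothetical and the bba's are those stated in Step 1 of Example 2.

```python
from itertools import product

EMPTY = frozenset()


def importance_discount(m, beta):
    """Eq. (3): scale every mass by beta and send 1 - beta to the empty set."""
    out = {x: beta * v for x, v in m.items()}
    out[EMPTY] = out.get(EMPTY, 0.0) + (1.0 - beta)
    return out


def pcr5_empty(m1, m2):
    """PCR5 on importance-discounted bba's, the empty set acting as a focal element."""
    fused = {}
    for (x1, a), (x2, b) in product(m1.items(), m2.items()):
        common = x1 & x2
        if common or (not x1 and not x2):
            fused[common] = fused.get(common, 0.0) + a * b
        elif a + b > 0.0:
            # Disjoint pair (possibly involving the empty set): proportional split.
            fused[x1] = fused.get(x1, 0.0) + a * a * b / (a + b)
            fused[x2] = fused.get(x2, 0.0) + b * b * a / (a + b)
    return fused


def normalize(m):
    """Redistribute the mass of the empty set proportionally to the other masses."""
    rest = {x: v for x, v in m.items() if x}
    total = sum(rest.values())
    return {x: v / total for x, v in rest.items()}


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m_c1 = {A: 0.0889, B | C: 0.5337, A | B | C: 0.3774}
m_c2 = {A: 0.5002, B: 0.1208, A | C: 0.1222, B | C: 0.2568}

fused = pcr5_empty(importance_discount(m_c1, 0.25), importance_discount(m_c2, 0.75))
fused_normalized = normalize(fused)
print({"".join(sorted(x)): round(v, 4) for x, v in fused_normalized.items()})
```

When this sketch is run on the bba's exactly as stated in Step 1 of Example 2, the normalized masses of A and B differ from the Bel(.) values quoted in Example 3. The quoted values are matched if the mass 0.1208 of m_C2(.) is carried by A ∪ B rather than by B, which suggests a labelling difference in the source tables rather than a different rule; this is an observation from the sketch, not a statement of the source.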


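If the classical AHP "fusion" method (i.e. weighted arithmetic mean) is used directly with the bba's m_C1(.) and m_C2(.), one gets:

$$m_{AHP}(.) = \begin{bmatrix} 0 & 0 \\ 0.0889 & 0.5002 \\ 0 & 0 \\ 0 & 0.1208 \\ 0 & 0 \\ 0 & 0.1222 \\ 0.5337 & 0.2568 \\ 0.3774 & 0 \end{bmatrix} \times \begin{bmatrix} 0.25 \\ 0.75 \end{bmatrix} = \begin{bmatrix} 0 \\ 0.3974 \\ 0 \\ 0.0906 \\ 0 \\ 0.0917 \\ 0.3260 \\ 0.0944 \end{bmatrix}$$

which would have provided the following result for decision-making:

Elem. of Θ    Bel(.)    BetP(.)    Pl(.)
A             0.3974    0.5200     0.6741
B             0         0.2398     0.5110
C             0         0.2403     0.5121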
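For comparison, the classical AHP "fusion" is a plain weighted average of the masses. The sketch below is illustrative only (the function name `weighted_average` is hypothetical) and uses the bba's of Example 2 as stated in Step 1, with the weights w = [0.25 0.75]'.

```python
def weighted_average(bbas, weights):
    """Componentwise weighted arithmetic mean of several bba's ({frozenset: mass})."""
    fused = {}
    for m, w in zip(bbas, weights):
        for x, v in m.items():
            fused[x] = fused.get(x, 0.0) + w * v
    return fused


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m_c1 = {A: 0.0889, B | C: 0.5337, A | B | C: 0.3774}
m_c2 = {A: 0.5002, B: 0.1208, A | C: 0.1222, B | C: 0.2568}

m_ahp = weighted_average([m_c1, m_c2], [0.25, 0.75])
print({"".join(sorted(x)): round(v, 4) for x, v in m_ahp.items()})
```

Since the weights sum to one and no product of masses is formed, the average never detects any conflict between the sources, which is the limitation pointed out in the text.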


In this very simple example, one sees that the importance discounting technique coupled with the PCR5-based fusion rule (what we call the DSmT-AHP approach) will suggest, as with classical AHP, to choose alternative A since car A has a bigger credibility (as well as a bigger pignistic probability and plausibility) than cars B or C. It is however worth noting that the values of Bel(.), BetP(.) and Pl(.) obtained by both methods are slightly different. The difference in results can have a strong impact in practice on the final result, for example if the costs of the vehicles also have to be included in the final decision (as explained at the end of Example 1). Note also that the uncertainties U(X) = Pl(X) − Bel(X) of alternatives X = A, B, C have been seriously diminished when using DSmT-AHP with respect to what we obtain with classical AHP, as seen in the following table. The uncertainty reduction is a nice expected property, especially important for decision-making support.

Elem. of Θ    U(.) with AHP    U(.) with DSmT-AHP
A             0.2767           0.1118
B             0.5110           0.3612
C             0.5121           0.3619


Important remark: If Dempster's rule is used instead of the PCR5∅ rule, one gets the following results when comparing the fusion of m_C1(.) with m_C2(.) (i.e. without importance discounting) with the fusion of m_{β1=w1=0.25,C1}(.) with m_{β2=w2=0.75,C2}(.) (i.e. with importance discounting of criteria C1 and C2):

Elem. of 2^Θ    Without discounting    With discounting
∅               0                      0
A               0.3588                 0.3588
B               0.0908                 0.0908
A ∪ B           0.0642                 0.0642
C               0.0918                 0.0918
A ∪ C           0.0650                 0.0649
B ∪ C           0.3294                 0.3294
A ∪ B ∪ C       0                      0

Clearly, Dempster's rule cannot deal properly with importance-discounted bba's as we have proposed in this work, because the importance discounting technique preserves the specificity of the primary information: Dempster's rule does not make a difference in results when combining either m_C1(.) with m_C2(.) or m_{β1≠1,C1}(.) with m_{β2≠1,C2}(.), due to the way the total conflicting mass of belief is processed. PCR5∅ deals more efficiently with importance-discounted bba's, as we have shown in this example. So it is not surprising that such a discounting technique has never been proposed and used in the DST framework, and this explains why only the classical Shafer's discounting technique (the reliability discounting) is generally adopted. By using Dempster's rule, the fusion designer has no other choice but to consider importance and reliability as the same notion! The DSmT framework with the PCR5 (or PCR6) rule and the importance discounting technique proposed here provides an interesting and simple solution for the fusion of sources with different importances, which makes a clear distinction between importances and reliabilities of sources.
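The insensitivity of Dempster's rule to importance discounting can be verified numerically. The sketch below is a minimal illustration under assumptions: Dempster's rule is implemented by discarding every product whose intersection is empty (including those involving the mass put on the empty set by the discounting) and renormalizing; the function names are hypothetical and the bba's are those of Step 1 of Example 2.

```python
from itertools import product

EMPTY = frozenset()


def importance_discount(m, beta):
    """Eq. (3): scale every mass by beta and send 1 - beta to the empty set."""
    out = {x: beta * v for x, v in m.items()}
    out[EMPTY] = out.get(EMPTY, 0.0) + (1.0 - beta)
    return out


def dempster(m1, m2):
    """Dempster's rule: conjunctive consensus followed by normalization."""
    fused = {}
    for (x1, a), (x2, b) in product(m1.items(), m2.items()):
        common = x1 & x2
        if common:
            fused[common] = fused.get(common, 0.0) + a * b
    total = sum(fused.values())  # equals 1 minus the total conflicting mass
    return {x: v / total for x, v in fused.items()}


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m_c1 = {A: 0.0889, B | C: 0.5337, A | B | C: 0.3774}
m_c2 = {A: 0.5002, B: 0.1208, A | C: 0.1222, B | C: 0.2568}

plain = dempster(m_c1, m_c2)
discounted = dempster(importance_discount(m_c1, 0.25), importance_discount(m_c2, 0.75))

same = all(abs(plain[x] - discounted[x]) < 1e-12 for x in plain)
print(same)  # prints: True
```

The discounting multiplies every non-empty product by β1β2, and the normalization divides this common factor out again, so the importance factors leave no trace in the result.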
V. DSMT-AHP FOR SOLVING MCGDM


Previously, a new approach mixing AHP with DSmT for solving the MCDM problem has been presented. In many practical situations however, the decision must be taken by a group of n > 1 Decision Makers (GDM), denoted GDM = {DM_i, i = 1, 2, ..., n}, rather than a single DM, and from the Multi-Criteria preference rankings of the DM_i's. The importance (influence) of each member of the GDM is usually non-equivalent [1], and the importance of each DM of the GDM must be efficiently taken into account in the final decision-making process. Let's denote by m_DMi(.) the result of the DSmT-AHP approach (see Section IV) related to DM_i ∈ GDM. The MCGDM problem consists in combining all opinions/preference rankings m_DMi(.), i = 1, ..., n, with their own (possibly different) importances. When all DM_i's have equal importance, the classical fusion rules⁸ ⊕ for combining the m_DMi(.) can be directly used to get the final result m_MCGDM(.) = [m_DM1 ⊕ m_DM2 ⊕ ... ⊕ m_DMn](.). If the DM_i's have different importance weights w_i, the DSmT-AHP approach can also be used at the GDM fusion level using the importance discounting approach presented here. The result for group decision-making is given by the PCR5∅ fusion of the m_{βi,DMi}(.), with β_i = w_i, and then the result must be normalized for decision-making support. In [6], Beynon used the classical discounting technique [23] to readjust m_DMi(.) with the w_i's and he identified the importance factors with the reliability factors. In our opinion, this is disputable since the importance of a DM_i is not necessarily related to its reliability but rather to the importance in the problem of the choice of his/her Multi-Criteria to establish his/her ranking, or it can come from other (political, hierarchical, etc.) reasons. In our new approach, we make a clear distinction between the notions of importance and reliability, and both notions can be easily taken into account [25] with DSmT-AHP for solving MCGDM problems, i.e. we can use the classical discounting technique for taking into account the reliabilities of the sources, and use the importance discounting proposed here for dealing with the importances of sources.


VI. CONCLUSIONS AND PERSPECTIVES 


In this paper, we have presented a new method for Multi-Criteria Decision-Making (MCDM) and Multi-Criteria Group Decision-Making (MCGDM) based on the combination of the AHP method developed by Saaty and DSmT. The AHP method allows to build bba's from the DM preferences of solutions, which are established with respect to several criteria. DSmT allows to aggregate efficiently the (possibly highly conflicting) bba's based on each criterion. This DSmT-AHP method also allows to take into account the different importances of the criteria and/or of the different members of the decision-makers group. The application of this DSmT-AHP approach to the prevention of natural hazards in mountains is currently in progress, see [24], Vol. 3, Chap. 23, and [27].


⁵Typically the PCR5 or PCR6 rules, or possibly Dempster’s rule if the
conflict between the $DM_i$’s is low.
Abstract-In theories of evidence, several methods have been
proposed to combine a group of basic belief assignments
altogether at a given time. However, in some applications in
defense or in robotics the evidences from different sources are
acquired only sequentially and must be processed in real-time,
and the combination result needs to be updated with the most recent
information. An approach for combining sequentially unreliable
sources of evidence is presented in this paper. The sources of
evidence are not considered as equi-reliable in the combination
process, and no prior knowledge on their reliability is required.
The reliability of each source is evaluated on the fly by a distance
measure, which characterizes the variation of one source
of evidence with respect to the others. If the source is considered
as unreliable, then its evidence is discounted before entering
the fusion process. Dempster’s rule of combination and its main
alternatives, including Yager’s rule, Dubois and Prade rule, and
PCR5, are adapted to work under different conditions. In this
paper, we propose to select the most suitable combination rule
according to the value of the conflicting belief before combining the
evidence. The last part of this paper is devoted to a numerical
example to illustrate the interest of this approach.

Keywords: evidence theory, combination rule, evidence 
distance, conflicting belief. 


I. INTRODUCTION 


Evidence theories¹ are widely applied in the field of information
fusion. Particular attention has been focused on how
to efficiently combine sources of evidence altogether at the
same time (static approach), and many rules besides Dempster’s
rule have been proposed [1], [2], [6], [9]. In many applications
however, the evidences from different sources are acquired
sequentially by different sensors or human experts, and the
belief updating and decision-making need to be done in real-time,
which requires a sequential/dynamic approach rather than
a static approach to the fusion problem.

The evidences arising from different independent
sources are usually considered equally reliable in the combination
process when prior knowledge about the reliability
of each source is unavailable. However, the sources of
evidence to be combined can have different reliabilities in real
applications. If the sources of evidence are considered as equi-reliable,
the unreliable ones may have a very bad influence
on the combination result, and may even lead to inconsistent results
and wrong decisions. Thus, the reliability of each source must
be taken into account in the fusion process as best as possible
to provide a useful and unbiased result. In this work, we
propose to evaluate on the fly the reliability of the sources to
combine based on an evidential distance/reliability measure.
From this reliability measure, one can discount accordingly
the unreliable sources before applying a rule of combination
of basic belief assignments (bba’s).

Many rules, like Dempster’s rule [7] and its alternatives, can
be used to combine sources of evidence expressed by bba’s,
and they all have their drawbacks and advantages (see [8],
Vol. 1, for a detailed presentation). Dempster’s rule is usually
considered well adapted for combining the evidences in low
conflict situations, and its complexity is acceptable when
the dimension of the frame of discernment is not too large.
Dempster’s rule however exhibits counter-intuitive behaviors
when the sources of evidence become highly conflicting. To
palliate this drawback, several interesting alternatives have
been proposed for the cases where Dempster’s rule doesn’t work well,
mainly: Yager’s rule [9], Dubois and Prade rule (DP rule)
[2], and PCR5 (proportional conflict redistribution rule no 5)
[8] developed in the DSmT framework. The difference between
Dempster’s rule and its main alternatives mainly lies in the
distribution of the conflicting belief $m_\oplus(\emptyset)$, which is generally
used to characterize the total amount of conflict [4] between
sources. In this paper, we propose to select the proper rule
of combination based on the value of the total degree of
conflict $m_\oplus(\emptyset)$. The last part of this paper presents a numerical
example to show how the approach of sequential adaptive
combination of unreliable sources of evidence works.


II. PRELIMINARIES 


A. Basics of Dempster-Shafer theory (DST) 


DST [7] is developed in Shafer’s model. In this model,
a fixed set $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ is called the frame of
discernment of the fusion problem. All the elements in $\Theta$ are
mutually exclusive and exhaustive. The set of all subsets of
$\Theta$ is called the power set of $\Theta$, and it is denoted $2^\Theta$. For
instance, if $\Theta = \{\theta_1, \theta_2, \theta_3\}$, then $2^\Theta = \{\emptyset, \theta_1, \theta_2, \theta_3, \theta_1 \cup \theta_2, \theta_1 \cup \theta_3, \theta_2 \cup \theta_3, \theta_1 \cup \theta_2 \cup \theta_3\}$. A basic belief assignment
(bba), also called mass of belief, is a mapping $m : 2^\Theta \to [0,1]$
associated to a given body of evidence $\mathcal{B}$ such that $m(\emptyset) = 0$
and $\sum_{A \in 2^\Theta} m(A) = 1$. The credibility (also called belief)
of $A \subseteq \Theta$ is defined by $Bel(A) = \sum_{B \in 2^\Theta,\, B \subseteq A} m(B)$. The
commonality function $q(.)$ and the plausibility function $Pl(.)$
are also defined by Shafer in [7]. The functions $m(.)$, $Bel(.)$,
$q(.)$ and $Pl(.)$ are in one-to-one correspondence.

¹DST (Dempster-Shafer Theory) [7] or DSmT (Dezert-Smarandache Theory) [8].

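Let $m_1(.)$ and $m_2(.)$ be two bba’s provided by two
independent bodies of evidence $\mathcal{B}_1$ and $\mathcal{B}_2$ over the frame of
discernment $\Theta$. The fusion/combination of $m_1(.)$ with $m_2(.)$,
denoted $m(.) = [m_1 \oplus m_2](.)$, is obtained in DST with
Dempster’s rule of combination as follows:

$$
m(\emptyset) = 0, \qquad
m(A) = \frac{\sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = A}} m_1(X_1)\, m_2(X_2)}{1 - \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = \emptyset}} m_1(X_1)\, m_2(X_2)}
\quad \forall A \neq \emptyset,\ A \in 2^\Theta \tag{1}
$$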

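The degree of conflict between the bodies of evidence $\mathcal{B}_1$
and $\mathcal{B}_2$ is defined by

$$
m_\oplus(\emptyset) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = \emptyset}} m_1(X_1)\, m_2(X_2) \tag{2}
$$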


Dempster’s rule can be directly extended to the combination
of $S$ independent and equally reliable sources. It is a commutative
and associative rule of combination and it preserves the
neutral impact of the vacuous belief assignment defined by
$m_{vba}(\Theta) = 1$.
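As an illustration of Eqs. (1) and (2), here is a minimal Python sketch of the degree of conflict and of Dempster’s rule for two sources. It assumes Shafer’s model and represents a bba as a dictionary mapping focal elements (as `frozenset` objects) to masses. The function names and the bba values are hypothetical choices made for this sketch and do not come from the paper.

```python
from itertools import product


def conflict(m1, m2):
    """Degree of conflict, Eq. (2): mass of all pairs with empty intersection."""
    return sum(v1 * v2
               for (x1, v1), (x2, v2) in product(m1.items(), m2.items())
               if not (x1 & x2))


def dempster(m1, m2):
    """Dempster's rule, Eq. (1); undefined when the sources fully conflict."""
    k = conflict(m1, m2)
    if k >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    fused = {}
    for (x1, v1), (x2, v2) in product(m1.items(), m2.items()):
        a = x1 & x2
        if a:  # only non-empty intersections keep mass, renormalized by 1 - k
            fused[a] = fused.get(a, 0.0) + v1 * v2 / (1.0 - k)
    return fused


# Illustrative bba's on Theta = {A, B, C} (values chosen for this sketch only).
A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
theta = A | B | C
m1 = {A: 0.6, A | B: 0.3, theta: 0.1}
m2 = {B: 0.5, theta: 0.5}

print(conflict(m1, m2))
print(dempster(m1, m2))
```

Combining any bba with the vacuous bba `{theta: 1.0}` returns the same bba, which is the neutrality property mentioned above.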


B. Main alternatives to Dempster’s rule 


Dempster’s rule yields counterintuitive results when the
evidences are highly conflicting because of its way of assigning the
mass of conflicting belief $m_\oplus(\emptyset)$. Thus, many alternatives
to Dempster’s rule have been proposed for overcoming its
limitations. The main alternative rules, including
Yager’s rule [9], DP rule [2] and PCR5 [8], are briefly recalled.


• Yager’s rule: Yager considers that the conflicting belief is not
reliable, so $m_\oplus(\emptyset)$ is transferred to the total ignorance
in Yager’s rule. It is given by $m(\emptyset) = 0$ and, for $A \neq \emptyset$,
$A \in 2^\Theta$, by

$$
\begin{aligned}
m(A) &= \sum_{\substack{X, Y \in 2^\Theta \\ X \cap Y = A}} m_1(X)\, m_2(Y), \quad \text{for } A \neq \Theta \\
m(\Theta) &= m_1(\Theta)\, m_2(\Theta) + \sum_{\substack{X, Y \in 2^\Theta \\ X \cap Y = \emptyset}} m_1(X)\, m_2(Y)
\end{aligned} \tag{3}
$$

• Dubois & Prade rule: This rule assumes that if two
sources of evidence are in conflict, one of them is right
but we don’t know which one. Thus, if $X \cap Y = \emptyset$, then
the mass committed to the set $X \cap Y$ by the conjunctive
operator should be transferred to $X \cup Y$. According to
this principle, DP rule is defined by $m(\emptyset) = 0$ and, for
$A \neq \emptyset$ and $A \in 2^\Theta$, by

$$
m(A) = \sum_{\substack{X, Y \in 2^\Theta \\ X \cap Y = A}} m_1(X)\, m_2(Y)
+ \sum_{\substack{X, Y \in 2^\Theta \\ X \cap Y = \emptyset \\ X \cup Y = A}} m_1(X)\, m_2(Y) \tag{4}
$$

• PCR5 rule: PCR5 transfers the partial conflicting mass
to the elements involved in the conflict, and it is considered
as the most mathematically exact redistribution of
conflicting mass to nonempty sets following the logic of
the conjunctive rule. PCR5 is defined by $m(\emptyset) = 0$ and,
$\forall A \neq \emptyset$, $A \in 2^\Theta$, by

$$
m(A) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = A}} m_1(X_1)\, m_2(X_2)
+ \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap A = \emptyset}} \left[ \frac{m_1(A)^2\, m_2(X_2)}{m_1(A) + m_2(X_2)} + \frac{m_2(A)^2\, m_1(X_2)}{m_2(A) + m_1(X_2)} \right] \tag{5}
$$


The details, examples and the extension of PCR5 formula 
(5) for S > 2 sources are given in [8]. 
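The three alternative rules recalled above can be sketched in Python for two sources as follows. This is a minimal sketch under Shafer’s model: a bba is assumed to be a dictionary mapping `frozenset` focal elements to masses, and the function names are hypothetical. The usage lines apply the rules to the bba’s of Zadeh’s example, $m_1(A) = 0.9$, $m_1(B) = 0.1$ and $m_2(B) = 0.1$, $m_2(C) = 0.9$.

```python
from itertools import product


def _add(out, key, value):
    out[key] = out.get(key, 0.0) + value


def yager(m1, m2, theta):
    """Yager's rule, Eq. (3): conflicting mass goes to the total ignorance."""
    out = {}
    for (x, vx), (y, vy) in product(m1.items(), m2.items()):
        _add(out, (x & y) or theta, vx * vy)
    return out


def dubois_prade(m1, m2):
    """DP rule, Eq. (4): conflicting mass of (X, Y) goes to the union X | Y."""
    out = {}
    for (x, vx), (y, vy) in product(m1.items(), m2.items()):
        _add(out, (x & y) or (x | y), vx * vy)
    return out


def pcr5(m1, m2):
    """PCR5, Eq. (5): each partial conflict is split between X and Y
    proportionally to the masses m1(X) and m2(Y) involved in it."""
    out = {}
    for (x, vx), (y, vy) in product(m1.items(), m2.items()):
        if x & y:
            _add(out, x & y, vx * vy)
        elif vx + vy > 0.0:
            _add(out, x, vx * vx * vy / (vx + vy))
            _add(out, y, vy * vy * vx / (vx + vy))
    return out


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
theta = A | B | C
m1 = {A: 0.9, B: 0.1}
m2 = {B: 0.1, C: 0.9}

res_yager = yager(m1, m2, theta)
res_dp = dubois_prade(m1, m2)
res_pcr5 = pcr5(m1, m2)
print(res_yager, res_dp, res_pcr5, sep="\n")
```

The three functions share the conjunctive part and differ only in where the mass of a conflicting pair is sent, which mirrors the way the rules are distinguished in the text.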


C. Discounting source of evidence 


When the sources of evidence are not considered equally
reliable, it is reasonable to discount each unreliable source $s_i$,
$i = 1, 2, \ldots, S$ by a reliability factor $\alpha_i \in [0,1]$. Following
the classical discounting method [7], a new discounted bba
$m'(.)$ is obtained from the initial bba $m(.)$ provided by the
unreliable source $s_i$ as follows

$$
\begin{aligned}
m'(A) &= \alpha_i \cdot m(A), \quad A \neq \Theta \\
m'(\Theta) &= 1 - \sum_{\substack{A \in 2^\Theta \\ A \neq \Theta}} m'(A)
\end{aligned} \tag{6}
$$

$\alpha_i = 1$ means total confidence in the source $s_i$, so the
original bba doesn’t need to be discounted. $\alpha_i = 0$ means that
the source $s_i$ is totally unreliable and its bba is revised as a
vacuous bba $m'(\Theta) = 1$, which will have a neutral impact in
the fusion process. In practice, the discounting method can be
used efficiently if one has a good estimation of the reliability
factor of each source. We show in the next section how one
can evaluate the reliability of a source.
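The classical discounting of Eq. (6) can be sketched in a few lines of Python. The sketch assumes a bba stored as a dictionary from `frozenset` focal elements to masses. The function name `discount` and the values used below are illustrative assumptions.

```python
def discount(m, alpha, theta):
    """Classical (Shafer) discounting, Eq. (6), with reliability factor alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Scale every focal element other than Theta by alpha ...
    out = {a: alpha * v for a, v in m.items() if a != theta}
    # ... and give the remaining mass to the total ignorance Theta.
    out[theta] = 1.0 - sum(out.values())
    return out


B, C = frozenset("B"), frozenset("C")
theta = frozenset("ABC")
m = {B: 0.9, C: 0.1}

half = discount(m, 0.5, theta)        # partially reliable source
vacuous = discount(m, 0.0, theta)     # totally unreliable source
unchanged = discount(m, 1.0, theta)   # fully reliable source
print(half, vacuous, unchanged, sep="\n")
```

With `alpha = 0` the result carries all its mass on `theta`, i.e. the vacuous bba. With `alpha = 1` the original masses are kept and `theta` receives a zero mass.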


III. EVALUATING THE RELIABILITY OF EACH SOURCE 


Without prior knowledge on the reliability of the sources
of evidence, we propose to evaluate the reliability factor of
each source based on the distance between the bba from a
given source $s_j$ and the others. If the bba of the
given source, say $s_j$, varies too much with respect to the others,
this source of evidence is considered not reliable and it will
be discounted before being combined. We will show further
how the discounting/reliability factor can be estimated. We
implicitly assume here that the following principle holds:
“Truth is reflected by the majority of opinions”.
In [3], Jousselme et al. have proposed the following distance
measure $d_J(\mathbf{m}_1, \mathbf{m}_2)$ between two bba’s² $\mathbf{m}_1 \triangleq m_1(.)$ and
$\mathbf{m}_2 \triangleq m_2(.)$ defined on the same power set $2^\Theta$:

$$
d_J(\mathbf{m}_1, \mathbf{m}_2) = \sqrt{\frac{1}{2} (\mathbf{m}_1 - \mathbf{m}_2)^T \mathbf{D}\, (\mathbf{m}_1 - \mathbf{m}_2)} \tag{7}
$$

where $\mathbf{D}$ is a $2^{|\Theta|} \times 2^{|\Theta|}$ positive matrix whose elements are
defined as $D_{ij} \triangleq \frac{|A_i \cap B_j|}{|A_i \cup B_j|}$, where $A_i$ and $B_j$ are elements of
the power set $2^\Theta$. $d_J(\mathbf{m}_1, \mathbf{m}_2) \in [0,1]$ is a distance which
measures the similarity between $\mathbf{m}_1$ and $\mathbf{m}_2$ considering both
the values and the relative specificity of focal elements of each
bba.
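A minimal Python sketch of the distance of Eq. (7) is given below. It assumes bba’s stored as dictionaries from `frozenset` focal elements to masses, and it sums only over the focal elements of the two bba’s, since all other components of the difference vector are zero. The function name and the two test bba’s are illustrative assumptions.

```python
from math import sqrt


def jousselme_distance(m1, m2):
    """Jousselme distance, Eq. (7), with D_ij = |A_i & B_j| / |A_i | B_j|."""
    focal = set(m1) | set(m2)
    diff = {a: m1.get(a, 0.0) - m2.get(a, 0.0) for a in focal}
    quad = 0.0
    for a, da in diff.items():
        for b, db in diff.items():
            quad += da * db * len(a & b) / len(a | b)
    # Guard against tiny negative values caused by floating-point rounding.
    return sqrt(max(quad, 0.0) / 2.0)


A, B = frozenset("A"), frozenset("B")
theta = frozenset("ABC")
certain_a = {A: 1.0}
certain_b = {B: 1.0}

print(jousselme_distance(certain_a, certain_a))
print(jousselme_distance(certain_a, certain_b))
print(jousselme_distance(certain_a, {theta: 1.0}))
```

Two identical bba’s are at distance 0, and two bba’s fully committed to disjoint singletons are at the maximal distance 1.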

The total degree of conflict $m_\oplus(\emptyset)$ obtained from all focal
elements which are incompatible doesn’t actually capture the
similarity between bba’s as shown by Martin et al. in [5].

If $N$ pieces of evidence $\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_N$ are combined
sequentially, two approaches similar to those of [5] could be used
to measure the variation between $\mathbf{m}_j$ and the others. The first one
considers the average value $\bar{d}_J$ of the distance between $\mathbf{m}_j$ and the others,
which is given by

$$
\bar{d}_J^{\,j-1}(\mathbf{m}_j) = \frac{1}{j-1} \sum_{i=1}^{j-1} d_J(\mathbf{m}_j, \mathbf{m}_i) \tag{8}
$$

The other one is simply defined as

$$
d_J^{\,j-1}(\mathbf{m}_j) = d_J(\mathbf{m}_j, \mathbf{m}_\oplus^{j-1}) \tag{9}
$$

where $\mathbf{m}_\oplus^{j-1} \triangleq m_\oplus^{j-1}(.)$ is obtained by the sequential
combination of the bba’s $m_1(.), m_2(.), \ldots, m_{j-1}(.)$, i.e.
$m_\oplus^{j-1}(.) = (((m_1 \oplus m_2) \oplus m_3) \cdots \oplus m_{j-1})(.)$ with a fusion
rule such as Dempster’s rule, Yager’s rule, DP rule, PCR5, etc.
The second measure, $d_J^{\,j-1}(\mathbf{m}_j)$, reflects only the difference
between $\mathbf{m}_j$ and the combined bba $\mathbf{m}_\oplus^{j-1}$ and thus cannot
precisely measure the similarity between $\mathbf{m}_j$ and the other
individual evidences $m_1(.), m_2(.), \ldots, m_{j-1}(.)$ because some
information on the specificities of these individual bba’s has been
lost forever through the fusion process. The following examples
will show the distinction between these two methods.
Example 1: Let’s consider the frame of discernment $\Theta = \{A, B, C\}$,
Shafer’s model and the following identical bba’s


my(.):  m,(A a m,(B) = 0.2 
mə(.): m2(A) = 0.5 a )=0.2 


m3(.) : 


The difference between $m_j(.)$, for $j \geq 2$, and all the bba’s
$m_i(.)$, for $i < j$, according to formula (8) gives $\bar{d}_J^{\,j-1}(\mathbf{m}_j) = 0$,
which shows correctly that $m_j(.)$ is identical to the other bba’s
$m_i(.)$, for $i < j$. If one uses the measure $d_J^{\,j-1}(\mathbf{m}_j)$ defined
in (9), one gets the results plotted in Fig. 1.






Fig. 1: Variation of the similarity measure $d_J^{\,j-1}(\mathbf{m}_j)$ based on
different fusion rules.


One sees that there exists a variation of the similarity
measure with all the different fusion rules, with a trend to certain
values when $j$ increases. This is a bad behavior since we know
that $\mathbf{m}_j$ is equal to the other bba’s and we would expect to get
$d_J^{\,j-1}(\mathbf{m}_j) = 0$, which unfortunately is not the case. That is the
main reason why we abandon the use of the $d_J^{\,j-1}(\mathbf{m}_j)$ measure
in the sequel of this work.

Example 2: Let’s consider the frame of discernment $\Theta = \{A, B, C, D, E\}$,
Shafer’s model, and the following bba’s

$m_1(.)$: $m_1(A) = 0.6$, $m_1(B) = m_1(C) = 0.1$, $m_1(D) = m_1(E) = 0.1$
$m_2(.)$: $m_2(\Theta) = 1$
$m_3(.)$: $m_3(\Theta) = 1$
$\vdots$
$m_{j-1}(.)$: $m_{j-1}(\Theta) = 1$
$m_j(.)$: $m_j(A) = 0.6$, $m_j(B) = m_j(C) = 0.1$, $m_j(D) = m_j(E) = 0.1$

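In this example, $m_j(.) = m_1(.)$, but $m_j(.)$ is quite different
from the other bba’s $m_i(.)$, $i \neq 1$. Since $d_J(\mathbf{m}_j, \mathbf{m}_1) = 0$ and
$d_J(\mathbf{m}_j, \mathbf{m}_i) = \sqrt{0.5}$ for each vacuous bba $\mathbf{m}_i$, the similarity measure
$\bar{d}_J^{\,j-1}(\mathbf{m}_j)$ between $m_j(.)$, for $j \geq 3$, and the bba’s $m_i(.)$,
$i < j$, is $\bar{d}_J^{\,j-1}(\mathbf{m}_j) = \frac{j-2}{j-1}\sqrt{0.5}$, which shows a trend to
$\sqrt{0.5} \approx 0.707$ when $j$ increases. However in such a case, one always gets,
using Dempster’s rule, Yager’s rule, DP rule or PCR5 rule,
$d_J^{\,j-1}(\mathbf{m}_j) = 0$. From such a very simple example, one sees that
one cannot detect the dissimilarity of $m_j(.)$ with a majority
of quite distinct bba’s if the $d_J^{\,j-1}(\mathbf{m}_j)$ measure is used. This
shows again that $d_J^{\,j-1}(\mathbf{m}_j)$ is actually not very appropriate
for measuring the similarity between a given bba $m_j(.)$ and a
set of bba’s. Therefore we will only consider the measure of
similarity $\bar{d}_J^{\,j-1}(\mathbf{m}_j)$ in the sequel.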


²Here for notation convenience, we use the usual vectorial notation $\mathbf{m}_1$ and $\mathbf{m}_2$ (with boldfaced letters) for representing the entire bba’s usually denoted
$m_1(.)$ and $m_2(.)$. $\mathbf{m}_1$ and $\mathbf{m}_2$ are vectors of dimension $2^{|\Theta|} \times 1$. We assume that the bba’s vectors are both ordered using the same order for their components.
For managing the computational burden in applications, a
parameter $n \leq j - 1$ is introduced in the measure $\bar{d}_J^{\,j-1}(\mathbf{m}_j)$
and we define the new measure:

$$
\bar{d}_J^{\,j-1}(\mathbf{m}_j) = \frac{1}{\min(n, j-1)} \sum_{i=1}^{\min(n, j-1)} d_J(\mathbf{m}_j, \mathbf{m}_{j-i}) \tag{10}
$$

The accuracy and the computational complexity of this
similarity measure increase when $n$ tends to $j - 1$.

Let’s consider a given discounting tolerance threshold $\omega_d$
in $[0,1]$. If $\bar{d}_J^{\,j-1}(\mathbf{m}_j) > \omega_d$, the bba $m_j(.)$
is considered as not similar enough to the other
bba’s, and therefore the source $s_j$ is considered as unreliable
and must be discounted before entering the fusion process.
The unreliability of the source $s_j$ may be caused by a fault
of the sensor or unexpected noises, condition changes, etc. In
such a case, the bba $m_j(.)$ needs to be discounted by formula
(6). As proposed by Martin et al. in [5], the reliability factor of
the source $s_j$ is chosen as $\alpha_j = (1 - \bar{d}_J^{\,j-1}(\mathbf{m}_j)^\lambda)^{1/\lambda}$, where
the parameter $\lambda$ is defined in the easiest way with $\lambda = 1$.
The larger the dissimilarity, the smaller the reliability factor.
If $\bar{d}_J^{\,j-1}(\mathbf{m}_j) \leq \omega_d$, the dissimilarity between
$m_j(.)$ and the other bba’s is acceptable, and there is no need to
revise/discount $m_j(.)$ in such a case.
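The reliability evaluation described in this section can be sketched as follows: the windowed average distance of Eq. (10), followed by the reliability factor with parameter $\lambda$. This is a minimal sketch in which the function names are hypothetical and in which a distance exactly equal to the threshold is treated as acceptable (an assumption). The three bba’s are $m_1$, $m_2$ and $m_3$ of the numerical example of Section VI, with the threshold 0.6 and $n = 5$ used there.

```python
from math import sqrt


def jousselme_distance(m1, m2):
    """Jousselme distance of Eq. (7) between two bba's (dict: frozenset -> mass)."""
    diff = {a: m1.get(a, 0.0) - m2.get(a, 0.0) for a in set(m1) | set(m2)}
    quad = sum(da * db * len(a & b) / len(a | b)
               for a, da in diff.items() for b, db in diff.items())
    return sqrt(max(quad, 0.0) / 2.0)


def average_distance(m_new, previous, n):
    """Eq. (10): mean distance to the min(n, j-1) most recent previous bba's."""
    if n < 1 or not previous:
        raise ValueError("need n >= 1 and at least one previous bba")
    window = previous[-n:]
    return sum(jousselme_distance(m_new, m) for m in window) / len(window)


def reliability_factor(dbar, threshold, lam=1.0):
    """alpha = (1 - dbar**lam)**(1/lam) above the threshold, 1 otherwise."""
    if dbar <= threshold:
        return 1.0          # dissimilarity acceptable: no discounting
    return max(0.0, 1.0 - dbar ** lam) ** (1.0 / lam)


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
theta = A | B | C
m1 = {A: 0.8, B: 0.1, theta: 0.1}
m2 = {A: 0.4, B: 0.25, C: 0.2, B | C: 0.15}
m3 = {B: 0.9, C: 0.1}

dbar = average_distance(m3, [m1, m2], n=5)
alpha = reliability_factor(dbar, threshold=0.6)
print(dbar, alpha)
```

For these values the average distance exceeds the threshold, so the third source would be discounted, in line with the discussion of the numerical example.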


IV. SELECTION OF COMBINATION RULES 


After evaluating the reliability of the sources, we have to
select a suitable combination rule. Dempster’s rule is known to
offer pretty good performance when the combined bba’s are
not in too high conflict. Otherwise, when the conflict becomes
too large, it is generally considered safer to use alternative
rules like Yager’s rule, DP rule, and PCR5 rule. The following
examples show the difference between the different approaches
for the fusion of sources of evidence.

Example 3: This is Zadeh’s example [10]. Let’s consider
$\Theta = \{A, B, C\}$ with Shafer’s model and the following bba’s

$m_1(.)$: $m_1(A) = 0.9$, $m_1(B) = 0.1$
$m_2(.)$: $m_2(B) = 0.1$, $m_2(C) = 0.9$

One sees that the two sources are in very high conflict
because the total conflict is $m_\oplus^{12}(\emptyset) = 0.99$. Using Dempster’s
rule, one surprisingly gets $m(B) = 1$, which is somewhat
counterintuitive since $\mathbf{m}_1$ and $\mathbf{m}_2$ both assign only a small
belief to $B$, but the fusion result states that $B$ is the only
possible solution with certainty, which seems unreasonable³.
If we use Yager’s rule, DP rule, and PCR5, one gets:

• Yager’s rule: $m(B) = 0.01$, $m(\Theta) = 0.99$

• DP rule: $m(B) = 0.01$, $m(A \cup B) = 0.09$,
$m(A \cup C) = 0.81$, $m(B \cup C) = 0.09$

• PCR5: $m(A) = 0.486$, $m(B) = 0.028$, $m(C) = 0.486$
These results are more reasonable in some sense, but they
are not the same. Yager’s rule transfers all the conflicting mass
to total ignorance and produces the least specific result of the
three rules. DP rule distributes the conflicting mass to the
union of the involved sets, which makes the uncertainty of the
result still very large. DP rule produces a less specific result
than PCR5 but DP is a bit more specific than Yager’s rule.
PCR5 provides the most specific result since $A$ and $C$ share
the same belief mass whereas $B$ keeps a very low belief assignment.
Therefore, in order to avoid counterintuitive results,
it is reasonable to use Yager’s rule, DP rule, or PCR5 rather than
Dempster’s rule as soon as the level of conflict becomes large.
The choice among Yager’s rule, DP rule, and PCR5 depends
on the application and the computational resources one has.
PCR5 is very appropriate if a decision has to be made because
it provides the most specific solution, but PCR5 carries the
highest computational burden. Sometimes it is better to get a less
specific result if we don’t need to take a clear/precise decision
in case of high conflict between sources. In such a case, Yager’s
rule and/or DP rule can be used instead. When the level of
conflict between two bba’s is low, Dempster’s rule can be
used since it offers a good compromise between computational
complexity and the specificity of the result.
Example 4: Let’s consider $\Theta = \{A, B, C\}$ and


$m_1(.)$: $m_1(A) = 0.35$, $m_1(B) = 0.38$, $m_1(A \cup B) = 0.15$,


The conflicts between each pair of bba’s are given by
$m_\oplus^{12}(\emptyset) = 0.455$, $m_\oplus^{13}(\emptyset) = 0.395$, $m_\oplus^{23}(\emptyset) = 0.395$. The
levels of these conflicts are not too large, and the
sequential combination $m(.) = [[m_1 \oplus m_2] \oplus m_3](.)$ using the
different rules yields

• Dempster’s rule: $m(A) = 0.5868$, $m(B) = 0.3592$,
$m(A \cup B) = 0.0202$, $m(C) = 0.0338$

• Yager’s rule: $m(A) = 0.3105$, $m(B) = 0.243$, $m(C) = 0.0555$,
$m(A \cup B) = 0.097$, $m(A \cup C) = 0.0455$, $m(\Theta) = 0.2485$

• DP rule: $m(A) = 0.3255$, $m(B) = 0.2295$, $m(C) = 0.0435$,
$m(A \cup B) = 0.1975$, $m(A \cup C) = 0.0575$,
$m(B \cup C) = 0.0345$, $m(\Theta) = 0.112$

• PCR5: $m(A) = 0.4889$, $m(B) = 0.3941$, $m(C) = 0.0819$,
$m(A \cup B) = 0.0268$, $m(A \cup C) = 0.0083$

All the rules provide reasonable results by assigning the
largest belief to $A$, but Dempster’s rule produces the most
specific result with less computational effort. Dempster’s
rule is thus well suited when $m_\oplus(\emptyset)$ is not too large.


V. ADAPTIVE COMBINATION OF SEQUENTIAL EVIDENCE 


Here we are concerned with the real-time decision-making
problem from the sequential acquisition of bba’s $m_1(.)$,
$m_2(.), \ldots, m_N(.)$ defined on the same frame $\Theta$ without any
prior knowledge about the reliability of each source. We start
with $m_1(.)$. When $m_2(.)$ is available, one combines it with
$m_1(.)$ by a suitable rule according to the value of $m_\oplus^{12}(\emptyset)$
without evaluating the reliability of the two sources. When
$m_j(.)$, for $j \geq 3$, becomes available at time $j$, the reliability
of the source $s_j$ is evaluated and $m_j(.)$ is discounted (if
necessary) by the approach presented in Section III. Before
combining the discounted bba $m'_j(.)$ (or $m_j(.)$ when no
discounting occurs) with the last updated bba $m_\oplus^{j-1}(.)$, the
combination rule is selected according to the value of the
conflict between $m_j(.)$ and $m_\oplus^{j-1}(.)$. We use a threshold
$\omega_\oplus$. If $m_\oplus(\emptyset) \leq \omega_\oplus$, Dempster’s rule is selected because it
offers a good compromise between complexity and specificity.
Otherwise, Yager’s rule, DP rule, or PCR5 is selected, depending on
the actual application, to avoid counterintuitive results.

³More generally, one can show that Dempster’s rule can become insensitive to the variation of the input bba’s to combine - see [8], Vol. 1, Chap. 5, p. 114 for example.

The tuning of the thresholds $\omega_d$ and $\omega_\oplus$ is not easy in general.
If the thresholds are too large, one takes the risk of getting
counterintuitive results, whereas if they are set to too low
values the non-specificity of the result will become large
and may even lead to decision-making under high uncertainty.
Therefore, both thresholds $\omega_d$ and $\omega_\oplus$ need to be determined from
accumulated experience depending on the actual application.
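The whole sequential procedure of this section can be sketched in Python as below. This is a minimal sketch under several assumptions: Shafer’s model, bba’s stored as dictionaries from `frozenset` focal elements to masses, $\lambda = 1$, PCR5 as the alternative rule, and a conflict measured on the bba that actually enters the combination (the discounted one when discounting occurs). All function and parameter names (`adaptive_fusion`, `w_d`, `w_c`, ...) are hypothetical. The usage lines take the first four bba’s of the numerical example of Section VI.

```python
from itertools import product
from math import sqrt


def distance(m1, m2):
    """Jousselme distance, Eq. (7)."""
    diff = {a: m1.get(a, 0.0) - m2.get(a, 0.0) for a in set(m1) | set(m2)}
    quad = sum(da * db * len(a & b) / len(a | b)
               for a, da in diff.items() for b, db in diff.items())
    return sqrt(max(quad, 0.0) / 2.0)


def discount(m, alpha, theta):
    """Classical discounting, Eq. (6)."""
    out = {a: alpha * v for a, v in m.items() if a != theta}
    out[theta] = 1.0 - sum(out.values())
    return out


def conflict(m1, m2):
    """Degree of conflict, Eq. (2)."""
    return sum(v1 * v2 for (x, v1), (y, v2)
               in product(m1.items(), m2.items()) if not (x & y))


def dempster(m1, m2):
    """Dempster's rule, Eq. (1); only called here when the conflict is below 1."""
    k, out = conflict(m1, m2), {}
    for (x, v1), (y, v2) in product(m1.items(), m2.items()):
        if x & y:
            out[x & y] = out.get(x & y, 0.0) + v1 * v2 / (1.0 - k)
    return out


def pcr5(m1, m2):
    """PCR5 rule, Eq. (5)."""
    out = {}
    for (x, v1), (y, v2) in product(m1.items(), m2.items()):
        if x & y:
            out[x & y] = out.get(x & y, 0.0) + v1 * v2
        elif v1 + v2 > 0.0:
            out[x] = out.get(x, 0.0) + v1 * v1 * v2 / (v1 + v2)
            out[y] = out.get(y, 0.0) + v2 * v2 * v1 / (v1 + v2)
    return out


def adaptive_fusion(bbas, theta, w_d=0.6, w_c=0.6, n=5, alt_rule=pcr5):
    """Sequentially fuse bbas; returns (fused bba, per-step log)."""
    fused, history, log = bbas[0], [bbas[0]], []
    for j, m in enumerate(bbas[1:], start=2):
        alpha = 1.0
        if j >= 3:  # reliability is only evaluated from the third bba on
            window = history[-n:]
            dbar = sum(distance(m, p) for p in window) / len(window)
            if dbar > w_d:
                alpha = 1.0 - dbar  # reliability factor with lambda = 1
        m_used = discount(m, alpha, theta) if alpha < 1.0 else m
        k = conflict(fused, m_used)
        rule = dempster if k <= w_c else alt_rule
        fused = rule(fused, m_used)
        history.append(m)  # distances are taken to the original bba's
        log.append((j, alpha, k, rule.__name__))
    return fused, log


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
theta = A | B | C
m1 = {A: 0.8, B: 0.1, theta: 0.1}
m2 = {A: 0.4, B: 0.25, C: 0.2, B | C: 0.15}
m3 = {B: 0.9, C: 0.1}
m4 = {B: 0.45, C: 0.45, B | C: 0.1}

fused, log = adaptive_fusion([m1, m2, m3, m4], theta)
for step in log:
    print(step)
print(fused)
```

Passing another function as `alt_rule` (for instance an implementation of Yager’s rule or of the DP rule) changes only the branch taken in high-conflict steps, which reflects the application-dependent choice discussed above.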


VI. NUMERICAL EXAMPLE 


Let us suppose a multisensor-based target identification
system. From five independent sensors, the system collects five
pieces of evidence sequentially (actually we consider here 2
possible bba’s $m_{5A}(.)$ and $m_{5B}(.)$ for the fifth source). For
decision-making in real-time, the combination result needs to
be updated right after the new evidence arrives. The bba’s
defined on the power set of $\Theta = \{A, B, C\}$ are as follows

$m_1(.)$: $m_1(A) = 0.8$, $m_1(B) = 0.1$, $m_1(\Theta) = 0.1$
$m_2(.)$: $m_2(A) = 0.4$, $m_2(B) = 0.25$, $m_2(C) = 0.2$, $m_2(B \cup C) = 0.15$
$m_3(.)$: $m_3(B) = 0.9$, $m_3(C) = 0.1$
$m_4(.)$: $m_4(B) = 0.45$, $m_4(C) = 0.45$, $m_4(B \cup C) = 0.1$
$m_{5A}(.)$: $m_{5A}(A) = 0.5$, $m_{5A}(A \cup B) = 0.25$, $m_{5A}(C) = 0.1$, $m_{5A}(A \cup C) = 0.15$
$m_{5B}(.)$: $m_{5B}(B) = 0.5$, $m_{5B}(A \cup B) = 0.25$, $m_{5B}(C) = 0.1$, $m_{5B}(B \cup C) = 0.15$


The five pieces of evidence are combined sequentially, and
the results are presented in Table 1. The chosen thresholds are
$\omega_d = 0.6$, $\omega_\oplus = 0.6$ and $n = 5$.

All the rules provide reasonable results when combining the
consistent bba’s $m_1(.)$ and $m_2(.)$. The bba $m_3(.)$ is highly
conflicting with $m_1(.)$ and $m_2(.)$. Since there is no prior
information about the reliability of the sources, we evaluate
the reliability of each source according to its variation with
respect to the others. The average similarity distance between
$\mathbf{m}_3$ and $\mathbf{m}_1$, $\mathbf{m}_2$ is so large that $\bar{d}_J^{\,j-1}(\mathbf{m}_3) > \omega_d$. Thus,
$m_3(.)$ is considered unreliable. If we combine directly (without
discounting) $m_3(.)$ with $m_\oplus^{2}(.)$ using Dempster, Yager, DP
or PCR5, one gets a high belief in $B$ with all the rules.
With the adaptive rule, the bba $m_3(.)$ is discounted with
the reliability factor $\alpha = 1 - \bar{d}_J^{\,j-1}(\mathbf{m}_3)$ to get $m'_3(.)$. The
combination of $m'_3(.)$ with $m_\oplus^{2}(.)$ now assigns the highest
belief to $A$. This adaptive method is helpful to deal with
the high conflicts caused by the unreliability of the sources.
The difference between $m_4(.)$ and $m_1(.)$, $m_2(.)$, $m_3(.)$ is
below the tolerance threshold, but the value of $m_\oplus(\emptyset)$ between
$m_4(.)$ and $m_\oplus^{3}(.)$ is very large, with $m_\oplus(\emptyset) = 0.8334 > \omega_\oplus$.
The result of Dempster’s rule indicates that the most credible
hypothesis is $B$, whereas $A$ is considered impossible, which
is not reasonable. The results produced by Yager’s rule and
DP rule when selected in the adaptive rule are full of uncertainty, and we
cannot even make a clear decision from them because of their
ways of distributing the mass of conflicting belief. We get
the specific output that most belief focuses on hypothesis $A$
only if PCR5 is selected in the adaptive rule. As we can see,
$m_1(.)$ and $m_2(.)$ strongly support the hypothesis $A$, whereas
$m_3(.)$ and $m_4(.)$ strongly support $B$. It is not easy to be
sure what the true hypothesis is. The adaptive rule tends to
preserve the earlier decision, since it assumes that $m_1(.)$ and
$m_2(.)$ were totally reliable, and then $m_3(.)$ is considered
unreliable and thus discounted. When $m_5(.)$ is available, if
$m_5(.)$ strongly supports $A$ as with $m_{5A}(.)$, the combination
results of all the adaptive rules commit their highest belief
to $A$. If $m_5(.)$ strongly supports $B$ as with $m_{5B}(.)$, the
combination results will change and assign the highest belief
to $B$. The results produced by the adaptive rule when selecting
the combination rule between Dempster and PCR5 are always the
most specific, which is very useful and helpful for decision-making
in real-time. The good performance of the adaptive rules
lies in the method of evaluating the reliability of the sources and
in the way of automatically selecting suitable combination rules.


VII. CONCLUSIONS 


An approach for adaptive combination of unreliable sources
of evidence has been proposed in this paper for combining
sequentially the sources without prior knowledge on their
reliabilities. The reliability of each source is evaluated
according to its similarity with respect to the others, which
is measured by an average distance of similarity. When a
source is not reliable enough, its bba is discounted to diminish
its influence in the fusion process and on decision-making.
Before the fusion of the sources, the suitable combination
rule is selected depending on the mass of conflicting belief
$m_\oplus(\emptyset)$ and the compromise between the computational burden
and the specificity of the result one wants to deal with.
Whenever $m_\oplus(\emptyset)$ is below the tolerance threshold, Dempster’s
rule can be chosen as a good rule of combination for such a
compromise. Otherwise, Yager’s rule, DP rule, or PCR5 must
be selected to avoid counterintuitive results. The choice
among these three rules depends on the application and the
acceptable risk in decision-making errors. PCR5 rule is very
appropriate to use in general for decision-making because
it provides the most specific fusion results, but it requires
more computational resources than other rules. If we want
to keep uncertain results and don’t necessarily need a very
specific decision in case of high conflict between sources,
Yager’s rule or DP rule can be selected instead. Our numerical
example shows the interest of the proposed approach. The
main difficulty however lies in the tuning of the thresholds $\omega_d$,
$\omega_\oplus$ and the parameter $n$ involved in its implementation. These
parameters must be selected by experience depending on the
application. This approach was based on Shafer’s model, but
could be extended to other models proposed in DSmT.
Fusion of Sources of Evidence
with Different Importances and Reliabilities 


F. Smarandache 
J. Dezert 
J.-M. Tacnet 


Abstract — This paper presents a new approach for 
combining sources of evidences with different impor- 
tances and reliabilities. Usually, the combination of 
sources of evidences with different reliabilities is done by 
the classical Shafer’s discounting approach. Therefore, 
to consider unequal importances of sources, if any, a 
similar reliability discounting process is generally used, 
making no difference between the notion of importance 
and reliability. In fact, in multicriteria decision con- 
text, these notions should be clearly distinguished. This 
paper shows how this can be done and we provide simple 
examples to show the differences between both solutions 
for managing importances and reliabilities of sources. 
We also discuss the possibility for mixing them in a 
global fusion process. 


Keywords: Information fusion, DSmT, discounting, 
importance, reliability, AHP. 


1 Introduction 


In many real-life fusion problems, one has to deal
with different sources of information arising from human
reports, artificial intelligence expert systems
and/or physical sensors. The information is usually
imprecise, uncertain, incomplete, qualitative or quantitative
and possibly conflicting. The task of information
fusion is to combine all the information in such a way
that one has a better understanding and assessment
of the situation of the complex problem under consideration
for decision-making support. Several theoretical
frameworks have been proposed in the literature
(Probability theory, Possibilities theory, Imprecise PT,
etc) but the most appealing ones are the theories of
belief functions, originally known as Dempster-Shafer
Theory (DST) [8] and then extended and refined in
Dezert-Smarandache Theory (DSmT) [9] for dealing
with qualitative information, for fusing highly conflicting
sources of evidence, for conditioning evidences,
etc. Aside from the choice of the “best” rule of combination
of sources of evidence characterized by their belief
functions, more specifically by their basic belief assignments
(bba’s), or belief masses, a very important
problem concerns the possibility that sources involved
in the fusion process may have neither the same reliability
nor the same importance. The reliability can be
seen as an objective property of a source of evidence,
whereas the importance of a source is a subjective property
of a source expressed by the fusion system designer.

The reliability of a source represents its ability to provide
the correct assessment/solution of the given problem.
The importance of a source represents somehow
the weight of importance granted to the source by the
fusion system designer. The reliability and importance
represent two distinct notions and the fusion process
must be able to deal with these notions. We show in
this paper how this can be done efficiently through two
discounting techniques using Proportional Conflict Redistribution
rules no 5 or no 6 (PCR5 or PCR6) developed
in the DSmT framework. We will also show that
such a solution cannot be used in the DST framework using
Dempster’s rule of combination because Dempster’s
rule doesn’t respond to our new importance discounting
(it only responds to reliability discounting¹).

The importance of a source is particularly crucial
since it is involved in multi-criteria decision making
(MCDM) problems, like in the Analytic Hierarchy Process
(AHP) developed by Saaty [6, 7]. That’s why it
is fundamental to show how the importance can be
efficiently managed in evidential reasoning approaches,
in particular in DSmT. The fusion system designer is
still free to make no difference between importance and
reliability and use the classical discounting technique.
In general however, one should consider the importance
and the reliability as two distinct notions and thus they
have to be processed in different ways. This is the purpose
of this paper. The application of this technique
in DSmT-AHP is presented in [2] and an application of
both DSmT and AHP for risk expertise and prevention
in mountains has been introduced by Tacnet in [11, 12]
and works are still in progress in this field.

¹Known as the classical Shafer’s discounting, see [8].

This paper is organized as follows. After a brief reminder
of the basics of DSmT for information fusion and
its main fusion rule in section 2, we present the classical
discounting technique for combining sources with
different reliabilities in section 3. In section 4, we introduce
a new solution for taking into account the possibly
different importances of sources in the fusion process.
Section 5 provides simple examples to show and
compare the results obtained from the two discounting
approaches. In section 6 we discuss the more general
problem where one needs to deal with both reliability
and importance at the same level in the fusion process.
Conclusions and perspectives are given in section 7.


2 Basics of DSmT 


Let $\Theta = \{\theta_1, \theta_2, \cdots, \theta_n\}$ be a finite set of $n$ elements
$\theta_i$, $i = 1, \ldots, n$ assumed to be exhaustive. $\Theta$ corresponds
to the frame of discernment of the problem under
consideration. In general (unless introducing some
integrity constraints), we assume that elements of $\Theta$
are non exclusive in order to deal with vague/fuzzy and
relative concepts [9], Vol. 2. This is the so-called free-DSm
model. In the DSmT framework, there is no need in
general to work on a refined frame consisting of a discrete
finite set of exclusive and exhaustive hypotheses²
because DSm rules of combination work for any model
of the frame, i.e. the free DSm model (no exclusivity
constraint between the $\theta_i$), Shafer’s model (all $\theta_i$ are exclusive)
or any hybrid model (only some $\theta_i$ are truly exclusive).
The power set $2^\Theta$ is defined as the set of all propositions
built from elements of $\Theta$ with $\cup$ [8]; $\Theta$ generates
$2^\Theta$ under $\cup$. The hyper-power set (Dedekind’s lattice)
$D^\Theta$ is defined as the set of all propositions built from
elements of $\Theta$ with $\cup$ and $\cap$; $\Theta$ generates $D^\Theta$ under $\cup$
and $\cap$, see [9] Vol. 1 for many detailed examples. The
super-power set (Boolean algebra) $S^\Theta$ is defined as the
set of all propositions built from elements of $\Theta$ with $\cup$
and $\cap$ and complement $c(.)$; $\Theta$ generates $S^\Theta$ under $\cup$, $\cap$
and $c(.)$, see [9] Vol. 3. $S^\Theta$ can be seen as the minimal
refined frame of $\Theta$. For notation convenience, we use
the generic notation $G^\Theta$ to represent the fusion space
under consideration depending on the application and
the underlying model chosen for the frame $\Theta$;
$G^\Theta$ can be either $2^\Theta$, $D^\Theta$ or $S^\Theta$. In the DST
framework, $G^\Theta = 2^\Theta$, whereas in DSmT we usually
work with $G^\Theta = D^\Theta$.

A (quantitative) basic belief assignment (bba) expressing
the belief committed to the elements of $G^\Theta$
by a given source/body of evidence is a mapping
$m(\cdot): G^\Theta \to [0,1]$ such that $m(\emptyset) = 0$ and
$\sum_{A \in G^\Theta} m(A) = 1$. Elements $A \in G^\Theta$ having $m(A) > 0$
are called focal elements of the bba $m(.)$. The general
belief and plausibility functions are defined respectively
in almost the same manner as Shafer in [8], i.e.

$$Bel(A) = \sum_{B \in G^\Theta,\ B \subseteq A} m(B) \qquad (1)$$

$$Pl(A) = \sum_{B \in G^\Theta,\ B \cap A \neq \emptyset} m(B) \qquad (2)$$

² Referred to as Shafer's model in the literature.


In DSmT, the Proportional Conflict Redistribution
Rule no. 5 (PCR5) has been proposed as a serious
alternative to Dempster's rule [8] for dealing with
conflicting belief functions. It has also been clearly
shown in [9], Vol. 3, chap. 1 that Smets' rule³ is not
so efficient, nor cogent, because it doesn't respond to
new information in a global or in a sequential fusion
process. Indeed, very quickly Smets' fusion result
commits the full mass of belief to the empty set!
Therefore in applications, some ad-hoc numerical 
techniques must be used to circumvent this serious 
drawback. Such problem doesn’t occur with PCR5 
rule. By construction, other well-known rules like 
Dubois & Prade, or Yager’s rule, and contrariwise to 
PCR5, increase the non-specificity of the result. An 
introduction to DSmT and PCR5 fusion rule with 
justification and several examples can be found in [9], 
Vol. 3, Chap. 1, freely downloadable from the web. 


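Definition of PCR5 (for two sources): Let $m_1(.)$
and $m_2(.)$ be two independent⁴ bba's; then the PCR5
rule of combination for two sources of evidence is defined
as follows (see [9], Vol. 2 for details, justification
and examples): $m_{PCR5}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \{\emptyset\}$

$$m_{PCR5}(A) = \sum_{\substack{X_1, X_2 \in G^\Theta \\ X_1 \cap X_2 = A}} m_1(X_1) m_2(X_2) + \sum_{\substack{X \in G^\Theta \\ X \cap A = \emptyset}} \left[ \frac{m_1(A)^2 m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2 m_1(X)}{m_2(A) + m_1(X)} \right] \qquad (3)$$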


All fractions in (3) having zero denominators are
discarded. In DSmT, we consider all propositions/sets
in a canonical form. We take the disjunctive normal
form, which is a disjunction of conjunctions; it is
unique in Boolean algebra and is the simplest. For example,
$X = A \cap B \cap (A \cup B \cup C)$ is not in a canonical form,
but after simplification $X = A \cap B$ is in a
canonical form. Like most fusion rules⁵, PCR5 is not
associative, and the optimal fusion result is obtained
by combining the sources altogether at the same time
when possible. Some properties of PCR5 can be
found in [1], and it allows non-Bayesian reasoning. An
extension of PCR5 for combining qualitative bba's can
be found in [9], Vol. 2 & 3.

³ i.e. the non-normalized Dempster's rule.
⁴ i.e. each source provides its bba independently of the other sources.
⁵ Except Dempster's rule, and the conjunctive rule in the free DSm model.


Basically, the idea of PCR5 is to transfer the conflicting
mass only to the elements involved in the conflict
and proportionally to their individual masses, so that
the specificity of the information is entirely preserved
through this fusion process. For example, consider two
bba's $m_1(.)$ and $m_2(.)$, $A \cap B = \emptyset$ for the model of $\Theta$,
and $m_1(A) = 0.6$ and $m_2(B) = 0.3$. With PCR5 the
partial conflicting mass $m_1(A) m_2(B) = 0.6 \cdot 0.3 = 0.18$
is redistributed to $A$ and $B$ only, with respect to the
following proportions respectively: $x_A = 0.12$ and
$x_B = 0.06$, because the proportionalization requires

$$\frac{x_A}{m_1(A)} = \frac{x_B}{m_2(B)} = \frac{m_1(A) m_2(B)}{m_1(A) + m_2(B)} = \frac{0.18}{0.9} = 0.2$$
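
The redistribution principle described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: bba's are stored as dictionaries mapping frozensets to masses under Shafer's model, the function name `pcr5_two_sources` is hypothetical, and since the example only fixes $m_1(A)$ and $m_2(B)$, the remaining masses are placed on $A \cup B$ as an assumption.

```python
from itertools import product


def pcr5_two_sources(m1, m2):
    """Two-source PCR5 for bba's given as {frozenset: mass} (Shafer's model)."""
    fused = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        if w1 == 0.0 or w2 == 0.0:
            continue  # nothing to combine, and avoids a zero denominator
        inter = x1 & x2
        if inter:
            # Non-empty intersection: conjunctive consensus.
            fused[inter] = fused.get(inter, 0.0) + w1 * w2
        else:
            # Partial conflict: give it back to x1 and x2 only,
            # proportionally to the masses w1 and w2 involved.
            conflict = w1 * w2
            fused[x1] = fused.get(x1, 0.0) + w1 * conflict / (w1 + w2)
            fused[x2] = fused.get(x2, 0.0) + w2 * conflict / (w1 + w2)
    return fused


A, B = frozenset("A"), frozenset("B")
# Assumption: the masses not specified in the example sit on A ∪ B.
m1 = {A: 0.6, A | B: 0.4}
m2 = {B: 0.3, A | B: 0.7}
fused = pcr5_two_sources(m1, m2)
for element, mass in fused.items():
    print(sorted(element), round(mass, 4))
```

Under these assumptions the partial conflict 0.18 contributes 0.12 to $A$ and 0.06 to $B$, on top of the conjunctive masses, and the fused masses still sum to one.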


Variant of PCR5 (PCR6): A variant of (3), called PCR6,
has been proposed by Martin and Osswald in [9], Vol. 2,
for combining $s > 2$ sources; its extension for working
in other fusion spaces is presented in [9]. For two sources,
PCR6 coincides with PCR5. The difference between PCR5 and
PCR6 lies in the way the proportional conflict redistribution is
done as soon as three or more sources are involved in
the fusion. For example, let's consider three sources
with bba's $m_1(.)$, $m_2(.)$ and $m_3(.)$, $A \cap B = \emptyset$ for the
model of the frame $\Theta$, and $m_1(A) = 0.6$, $m_2(B) = 0.3$,
$m_3(B) = 0.1$. With PCR5 the partial conflicting mass
$m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 0.018$ is redistributed
back to $A$ and $B$ only with respect to the
following proportions respectively: $x_A^{PCR5} = 0.01714$
and $x_B^{PCR5} = 0.00086$, because the proportionalization
requires

$$\frac{x_A^{PCR5}}{m_1(A)} = \frac{x_B^{PCR5}}{m_2(B) m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + m_2(B) m_3(B)}$$

that is

$$\frac{x_A^{PCR5}}{0.6} = \frac{x_B^{PCR5}}{0.03} = \frac{0.018}{0.6 + 0.03} \approx 0.02857$$

thus

$$x_A^{PCR5} = 0.60 \cdot 0.02857 \approx 0.01714$$
$$x_B^{PCR5} = 0.03 \cdot 0.02857 \approx 0.00086$$

With the PCR6 fusion rule, the partial conflicting mass
$m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 0.018$ is redistributed
back to $A$ and $B$ only with respect to the following
proportions respectively: $x_A^{PCR6} = 0.0108$ and
$x_B^{PCR6} = 0.0072$, because the PCR6 proportionalization
is done as follows:

$$\frac{x_A^{PCR6}}{m_1(A)} = \frac{x_{B,2}^{PCR6}}{m_2(B)} = \frac{x_{B,3}^{PCR6}}{m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + m_2(B) + m_3(B)}$$

that is

$$\frac{x_A^{PCR6}}{0.6} = \frac{x_{B,2}^{PCR6}}{0.3} = \frac{x_{B,3}^{PCR6}}{0.1} = \frac{0.018}{0.6 + 0.3 + 0.1} = 0.018$$

thus

$$x_A^{PCR6} = 0.6 \cdot 0.018 = 0.0108$$
$$x_{B,2}^{PCR6} = 0.3 \cdot 0.018 = 0.0054$$
$$x_{B,3}^{PCR6} = 0.1 \cdot 0.018 = 0.0018$$

and therefore with PCR6, one finally gets the following
redistributions to $A$ and $B$:

$$x_A^{PCR6} = 0.0108$$
$$x_B^{PCR6} = x_{B,2}^{PCR6} + x_{B,3}^{PCR6} = 0.0054 + 0.0018 = 0.0072$$

From the implementation point of view, PCR6 is much
simpler to implement than PCR5 (see Appendix).
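
The two proportionalizations can be compared on the single partial conflict of this example with a short Python sketch. The function names are hypothetical and the code only redistributes one conflicting product, as an illustration of the principle rather than a full fusion rule: PCR5 first multiplies the masses that support the same element, whereas PCR6 keeps one share per source.

```python
from collections import defaultdict
from math import prod


def redistribute_pcr5(assignments):
    """Split one partial conflict the PCR5 way.

    assignments: one (element, mass) pair per source, whose elements
    have an empty overall intersection.
    """
    conflict = prod(w for _, w in assignments)
    grouped = defaultdict(lambda: 1.0)
    for element, w in assignments:
        grouped[element] *= w  # product of the masses backing the same element
    total = sum(grouped.values())
    return {element: p * conflict / total for element, p in grouped.items()}


def redistribute_pcr6(assignments):
    """Split one partial conflict the PCR6 way (one share per source)."""
    conflict = prod(w for _, w in assignments)
    total = sum(w for _, w in assignments)
    shares = defaultdict(float)
    for element, w in assignments:
        shares[element] += w * conflict / total
    return dict(shares)


partial_conflict = [("A", 0.6), ("B", 0.3), ("B", 0.1)]
pcr5 = redistribute_pcr5(partial_conflict)
pcr6 = redistribute_pcr6(partial_conflict)
print({k: round(v, 5) for k, v in pcr5.items()})
print({k: round(v, 5) for k, v in pcr6.items()})
```

Both rules give back exactly the conflicting mass 0.018; only the split between $A$ and $B$ differs.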


3 Reliability discounting 


Reliability refers to information quality, while importance
refers to subjective preferences of the fusion system
designer. The reliability of a source represents its
ability to provide the correct assessment/solution of
the given problem. It is characterized by a reliability
discounting factor, usually denoted $\alpha$ in [0,1], which
should be estimated from statistics when available, or
by other techniques [3]. This reliability factor can be
context-dependent. For example, if one knows that
some sensors do not perform well under bad weather
conditions, one will decrease the reliability factor
of information arising from those sources accordingly. By
convention, we usually take $\alpha = 1$ when the source is
fully reliable and $\alpha = 0$ if the source is totally unreliable.
Reliability of a source is generally considered⁶
through Shafer's discounting method [8], p. 252, which
consists in multiplying the masses of focal elements by
the reliability factor $\alpha$, and transferring all the remaining
discounted mass to the full ignorance $\Theta$. When
$\alpha < 1$, this very simple reliability discounting technique
discounts all focal elements with the same factor
$\alpha$ and increases the non-specificity of the discounted
sources, since the mass committed to the full ignorance
always increases. Mathematically, Shafer's discounting
technique for taking into account the reliability factor
$\alpha \in [0,1]$ of a given source with a bba $m(.)$ and a frame
$\Theta$ is defined by:

$$\begin{cases} m_\alpha(X) = \alpha \cdot m(X), & \text{for } X \neq \Theta \\ m_\alpha(\Theta) = \alpha \cdot m(\Theta) + (1 - \alpha) \end{cases} \qquad (4)$$
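
As a minimal sketch of Shafer's discounting, assuming a bba stored as a Python dictionary from frozensets to masses (the function name `reliability_discount` and the numerical values are illustrative choices, not taken from the paper):

```python
def reliability_discount(m, alpha, theta):
    """Shafer's discounting: scale every mass by alpha, move 1 - alpha to theta."""
    discounted = {x: alpha * w for x, w in m.items()}
    discounted[theta] = discounted.get(theta, 0.0) + (1.0 - alpha)
    return discounted


A, B = frozenset("A"), frozenset("B")
theta = A | B                      # full ignorance for the frame {A, B}
m = {A: 0.8, B: 0.2}               # illustrative bba
m_alpha = reliability_discount(m, 0.5, theta)
for element, mass in m_alpha.items():
    print(sorted(element), mass)
```

With $\alpha = 0.5$ the masses of $A$ and $B$ are halved and the removed mass 0.5 goes to $\Theta$, which shows how the discounted source becomes less specific.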


4 Importance discounting 


The importance of a source is not the same as its reliability
and it can be characterized by an importance
factor, denoted $\beta$ in [0,1]. The factor $\beta$ represents somehow
the weight of importance granted to the source by
the fusion system designer. The choice of $\beta$ is usually
not related to the reliability of the source, and $\beta$ can be
set to any value in [0,1] by the designer for his/her
own reasons. By convention, the fusion system designer
will take $\beta = 1$ when he/she wants to grant maximal
importance to the source in the fusion process,
and will take $\beta = 0$ if no importance at all is granted
to this source in the fusion process. Typically, if one
has a pool of experts around a table to take an important
decision, say politicians, scientific researchers, military
officers, etc., it is possible that one wants to grant more
importance to the voice of a given politician (say the
President) than to a military officer or a scientific
researcher, even if the scientific researcher is more
reliable in his field of expertise than the other people. Such
situations occur frequently in real-life problems. The
fusion designer must be able to deal with importance
factors in a different way than with reliability factors
since they correspond to distinct properties associated
with a source of information.

⁶ More sophisticated methods have also been proposed, see [4, 5] for example.

The main question we are concerned with in this paper is
how to deal with different importances of sources in the
fusion process in such a way that a clear distinction is
made/preserved between reliability and importance.

Our preliminary investigations were based on the
self/auto-combination of the sources. For example,
if one has the importance factors $\beta_1 = 0.7$ for the
source $s_1$ and $\beta_2 = 0.3$ for the source $s_2$, one could
imagine combining the bba $m_1(.)$ 7 times with itself,
combining the bba $m_2(.)$ 3 times with itself, and
then combining the resulting auto-fusioned bba's, because
such a combination would reflect somehow the relative
importance of the sources in the fusion process since
$\beta_1/\beta_2 = 0.7/0.3 = 7/3$. Actually such an approach is very
disputable and cannot be used satisfactorily in practice
whatever fusion rule is adopted. It can easily be
shown that the auto-conflict tends quickly to 1 after
several auto-fusions [3]. In other words, the combination
result of $N \times \beta_1$ bba's $m_1(.)$ with $M \times \beta_2$ bba's
$m_2(.)$ is almost the same for any $N$ and $M$ sufficiently
large, so that the different importances of sources are
not well preserved in such an approach. The numerical
complexity of such a method must also be pointed out, since
it would require computing possibly many auto-fusions
of each source, which is a very time-consuming computational
task. For example, if $\beta_1 = 0.791$ and $\beta_2 = 0.209$,
it would require combining at least 791 auto-fusions of
$m_1(.)$ with 209 auto-fusions of $m_2(.)$!

In this paper, we propose a better solution to consider
importances of sources. Our new approach can
be considered as the dual of Shafer's discounting approach
for reliabilities of sources. The idea was originally
introduced briefly by Tacnet in [9], Vol. 3, Chap.
23, p. 613. It consists in defining the importance discounting
with respect to the empty set rather than the
total ignorance $\Theta$ (as done by Shafer in the reliability
discounting presented in section 3). This new discounting
technique makes it easy to deal with sources of different
importances and is also very simple to use, as will be
shown.

Definition (importance discounting): We define the
importance discounting of a source having the importance
factor $\beta$ and associated bba $m(.)$ by

$$\begin{cases} m_\beta(X) = \beta \cdot m(X), & \text{for } X \neq \emptyset \\ m_\beta(\emptyset) = \beta \cdot m(\emptyset) + (1 - \beta) \end{cases} \qquad (5)$$

Note that with this importance discounting approach,
we allow non-normal bba's since $m_\beta(\emptyset) \geq 0$.
The interest of this new discounting is to preserve the
specificity of the primary information, since all focal
elements are discounted with the same importance factor
and no mass is committed back to partial or total
ignorances. Working with a positive mass of belief on
the empty set is not new; it was introduced in the
nineties by Smets in his transferable belief model [10].
Here we use the positive mass of the empty set as an
intermediate/preliminary step of the fusion process.
Clearly, when $\beta = 1$ is chosen by the fusion designer, the
source must take its full importance
in the fusion process and so the original bba $m(.)$ is
kept unchanged. If the fusion designer takes $\beta = 0$, one
will deal with $m_\beta(\emptyset) = 1$, which must be interpreted
as a fully non-important source. $m(\emptyset) > 0$ is not
interpreted as the mass committed to some conflicting
information (classical interpretation), nor as the mass
committed to unknown elements when working with
the open-world assumption (Smets interpretation), but
only as the mass of the discounted importance of a
source in this particular context.
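
The importance discounting defined above can be sketched in Python as follows. This is a hedged illustration only: the dictionary representation, the constant `EMPTY` and the function name `importance_discount` are assumptions made for the example, and the input bba is an arbitrary one.

```python
EMPTY = frozenset()  # the (absolute) empty set


def importance_discount(m, beta):
    """Importance discounting: scale every mass by beta, move 1 - beta to the empty set."""
    discounted = {x: beta * w for x, w in m.items()}
    discounted[EMPTY] = discounted.get(EMPTY, 0.0) + (1.0 - beta)
    return discounted


A, B = frozenset("A"), frozenset("B")
m = {A: 0.8, B: 0.2}                    # illustrative bba
m_beta = importance_discount(m, 0.2)
for element, mass in m_beta.items():
    print(sorted(element), round(mass, 4))
```

The ratio between the masses of $A$ and $B$ is unchanged and nothing is moved to $A \cup B$, so the specificity of the source is preserved; the removed mass 0.8 sits on the empty set until the final normalization.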


Before going further, it is worth noting that
Dempster's rule cannot deal properly with the importance
discounted bba's proposed in (5), because our importance
discounting technique preserves the specificity
of the primary information and Dempster's rule does
not make a difference in results when combining $m_1(.)$
with $m_2(.)$ or when combining $m_{\beta_1 \neq 1}(.)$ with $m_{\beta_2 \neq 1}(.)$,
due to the way the total conflicting mass
of belief is processed. This can be stated as the following theorem:


Theorem 1: Dempster's rule does not respond to
the discounting of sources towards the empty set.

Proof: Let $m_1(.)$ and $m_2(.)$ be two bba's defined
on the fusion space $G^\Theta = \{X_1, X_2, \ldots, X_n\}$. Let
$m_1(X_i) = a_i$ for all $i$, with $\sum_i a_i = 1$ and all
$a_i$ in [0,1], and let $m_2(X_i) = b_i$ for all $i$, with
$\sum_i b_i = 1$ and all $b_i$ in [0,1]; $m_1(\emptyset) = m_2(\emptyset) = 0$.
After discounting both $m_1(.)$ and $m_2(.)$ towards the
empty set with $\beta_1$ and $\beta_2$ in [0,1] respectively, we
get $m_{\beta_1}(X_i) = \beta_1 a_i$ for all $i$ and
$m_{\beta_1}(\emptyset) = 1 - \beta_1$, and
$m_{\beta_2}(X_i) = \beta_2 b_i$ for all $i$ and
$m_{\beta_2}(\emptyset) = 1 - \beta_2$. If we
apply the conjunctive rule to $m_1(.)$ and $m_2(.)$ we get
$m_{12}(X_i) = c_i$, with $\sum_i c_i = 1$ and $c_i$ in [0,1], where
some $X_i$ could be empty intersections. Suppose the
non-empty resulting sets after applying the conjunctive
rule are $X_{i_1}, \ldots, X_{i_p}$. Then Dempster's rule gives
$m_{DS}(X_{i_k}) = m_{12}(X_{i_k}) / (m_{12}(X_{i_1}) + \ldots + m_{12}(X_{i_p}))$, for
$1 \leq k \leq p$. If we apply the conjunctive rule to $m_{\beta_1}(.)$
and $m_{\beta_2}(.)$ we get $m_{12}^{\beta_1\beta_2}(X_i) = \beta_1 \beta_2 c_i$,
with $\beta_1 \beta_2 c_i$ in [0,1], where some $X_i$ could be
empty sets, and $m_{12}^{\beta_1\beta_2}(\emptyset^7) = 1 - \beta_1 \beta_2$. Now
Dempster's rule normalizes the conjunctive result
by dividing the mass of each non-empty
set by the sum of the masses of all non-empty sets. The non-empty
resulting sets after applying the conjunctive rule are the
same: $X_{i_1}, \ldots, X_{i_p}$. Then:

$$m_{DS}^{\beta_1\beta_2}(X_{i_k}) = \frac{\beta_1 \beta_2 \, m_{12}(X_{i_k})}{\beta_1 \beta_2 \, m_{12}(X_{i_1}) + \ldots + \beta_1 \beta_2 \, m_{12}(X_{i_p})} = \frac{m_{12}(X_{i_k})}{m_{12}(X_{i_1}) + \ldots + m_{12}(X_{i_p})} = m_{DS}(X_{i_k})$$

since the whole fraction is simplified by $\beta_1 \beta_2$, for $1 \leq k \leq p$. Hence Dempster's
rule does not respond to the discounting of sources
towards the empty set.
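
Theorem 1 can be checked numerically with a small Python sketch. The representation (dictionaries keyed by frozensets), the helper names and the numerical values are assumptions chosen for the illustration; the `dempster` function below is a plain two-source conjunctive rule followed by normalization over non-empty sets, written here only to exhibit the invariance.

```python
from itertools import product

EMPTY = frozenset()


def importance_discount(m, beta):
    """Scale every mass by beta and move 1 - beta to the empty set."""
    discounted = {x: beta * w for x, w in m.items()}
    discounted[EMPTY] = discounted.get(EMPTY, 0.0) + (1.0 - beta)
    return discounted


def dempster(m1, m2):
    """Conjunctive rule, then normalization over the non-empty sets."""
    conjunctive = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        inter = x1 & x2
        conjunctive[inter] = conjunctive.get(inter, 0.0) + w1 * w2
    non_empty = {x: w for x, w in conjunctive.items() if x}
    norm = sum(non_empty.values())
    return {x: w / norm for x, w in non_empty.items()}


A, B = frozenset("A"), frozenset("B")
m1 = {A: 0.8, B: 0.2}
m2 = {A: 0.4, B: 0.6}
plain = dempster(m1, m2)
discounted = dempster(importance_discount(m1, 0.2), importance_discount(m2, 0.8))
print({tuple(x): round(w, 4) for x, w in plain.items()})
print({tuple(x): round(w, 4) for x, w in discounted.items()})
```

Both calls return the same bba, because the common factor $\beta_1 \beta_2$ cancels in the normalization: the importance factors leave no trace in the result of Dempster's rule.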


From Theorem 1, one understands why such an importance
discounting technique has never been proposed
and used in the DST framework, and this explains why only the
classical Shafer's discounting technique (the reliability
discounting) has been largely adopted so far.
By using Dempster's rule, the fusion designer has no
other choice but to consider importance and reliability
as the same notion! As will be shown, the DSmT
framework with the PCR5 (or PCR6) rule and the importance
discounting technique proposed here provides
an interesting and simple solution for the fusion of
sources with different importances, which makes a clear
distinction between importances and reliabilities of
sources.


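Fusion of importance discounted bba's: Based on
this new discounting technique, it is very simple
to adapt the PCR5 or PCR6 fusion rules for combining
the $s \geq 2$ discounted bba's associated with each source
$i$, $i = 1, 2, \ldots, s$. It suffices actually to consider the following
extension of PCR5, denoted $PCR5_\emptyset$ and defined
by:

• For two sources ($s = 2$): $\forall A \in G^\Theta$ ($A$ may be the
empty set too)

$$m_{PCR5_\emptyset}(A) = \sum_{\substack{X_1, X_2 \in G^\Theta \\ X_1 \cap X_2 = A}} m_1(X_1) m_2(X_2) + \sum_{\substack{X \in G^\Theta \\ X \cap A = \emptyset}} \left[ \frac{m_1(A)^2 m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2 m_1(X)}{m_2(A) + m_1(X)} \right] \qquad (6)$$

• For $s > 2$ sources: $\forall A \in G^\Theta$ ($A$ may be the empty
set too)

$$m_{PCR5_\emptyset}(A) = m_{12 \ldots s}(A) + \sum_{\substack{2 \leq t \leq s \\ 1 \leq r_1, \ldots, r_t \leq s \\ 1 \leq r_1 < r_2 < \ldots < r_{t-1} < (r_t = s)}} \ \sum_{\substack{X_{j_2}, \ldots, X_{j_t} \in G^\Theta \setminus \{A\} \\ \{j_2, \ldots, j_t\} \in P^{t-1}(\{1, \ldots, n\}) \\ A \cap X_{j_2} \cap \ldots \cap X_{j_t} = \emptyset \\ \{i_1, \ldots, i_s\} \in P^{s}(\{1, \ldots, s\})}} \frac{\left( \prod_{k_1=1}^{r_1} m_{i_{k_1}}(A)^2 \right) \cdot \prod_{l=2}^{t} \left( \prod_{k_l = r_{l-1}+1}^{r_l} m_{i_{k_l}}(X_{j_l}) \right)}{\left( \prod_{k_1=1}^{r_1} m_{i_{k_1}}(A) \right) + \left[ \sum_{l=2}^{t} \left( \prod_{k_l = r_{l-1}+1}^{r_l} m_{i_{k_l}}(X_{j_l}) \right) \right]} \qquad (7)$$

where $i$, $j$, $k$, $r$, $s$ and $t$ in (7) are integers,

$$m_{12 \ldots s}(A) = \sum_{\substack{X_1, X_2, \ldots, X_s \in G^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = A}} \ \prod_{i=1}^{s} m_i(X_i)$$

is the conjunctive consensus on $A$ between $s$ sources,
and where all denominators are different from
zero. If a denominator is zero, that fraction is
discarded; $P^k(\{1, 2, \ldots, n\})$ is the set of all subsets
of $k$ elements from $\{1, 2, \ldots, n\}$ (permutations
of $n$ elements taken by $k$), the order of elements
doesn't count.

⁷ i.e. the absolute empty set, not that resulting from set intersections which are empty.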


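A similar extension can be done for the PCR6 formula
for $s > 2$ sources given in [9], Vol. 2. More precisely,
for any $A$ in $G^\Theta$ ($A$ may be the empty set too)
we define:

$$m_{PCR6_\emptyset}(A) = m_{12 \ldots s}(A) + \sum_{\substack{X_1, X_2, \ldots, X_{s-1} \in G^\Theta \\ X_i \neq A,\ i \in \{1, \ldots, s-1\} \\ \left( \bigcap_{i=1}^{s-1} X_i \right) \cap A = \emptyset}} \ \sum_{k=1}^{s-1} \ \sum_{(i_1, i_2, \ldots, i_s) \in P(1, 2, \ldots, s)} [m_{i_1}(A) + m_{i_2}(A) + \ldots + m_{i_k}(A)] \times \frac{\prod_{j=1}^{k} m_{i_j}(A) \prod_{p=k+1}^{s} m_{i_p}(X_p)}{\sum_{j=1}^{k} m_{i_j}(A) + \sum_{p=k+1}^{s} m_{i_p}(X_p)} \qquad (8)$$

where $P(1, 2, \ldots, s)$ is the set of all permutations of
the elements $\{1, 2, \ldots, s\}$. It should be observed that
$X_1, X_2, \ldots, X_{s-1}$ may be different from each other,
or some of them equal and others different, etc.

As a particular case for $s = 3$ sources, one gets for
any $A$ in $G^\Theta$ ($A$ may be the empty set too):

$$m_{PCR6_\emptyset}(A) = m_{123}(A) + \sum_{\substack{X, Y \in G^\Theta \\ X \neq A,\ Y \neq A \\ X \cap Y \cap A = \emptyset}} \ \sum_{(i_1, i_2, i_3) \in P(1,2,3)} \left[ \frac{m_{i_1}(A)^2 \, m_{i_2}(X) \, m_{i_3}(Y)}{m_{i_1}(A) + m_{i_2}(X) + m_{i_3}(Y)} + [m_{i_1}(A) + m_{i_2}(A)] \cdot \frac{m_{i_1}(A) \, m_{i_2}(A) \, m_{i_3}(X)}{m_{i_1}(A) + m_{i_2}(A) + m_{i_3}(X)} \right] \qquad (9)$$

where $m_{123}(A)$ is the mass of the conjunctive consensus
on $A$ and $P(1, 2, 3)$ is the set of all permutations of the
elements $\{1, 2, 3\}$. It should be observed that $X$ may
be different from or equal to $Y$.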

The difference between formulas (3) and (6) is that
$m_{PCR5}(\emptyset) = 0$ whereas $m_{PCR5_\emptyset}(\emptyset) \geq 0$. Of course,
since we usually need to work with normal bba's for
decision-making support, the result $m_{PCR5_\emptyset}(.)$, or
$m_{PCR6_\emptyset}(.)$, of the fusion of discounted masses $m_{\beta_i}(.)$
will be normalized by redistributing the mass of belief
committed to the empty set to the other focal elements
proportionally to their masses (see next example).


5 Example 


For convenience and simplicity, and due to space limitation
constraints, we give a very simple example working
on the classical power set $2^\Theta$, since most readers
familiar with belief functions usually work with this fusion
space. Example 1: Let's consider $\Theta = \{A, B\}$,
Shafer's model, and two sources with respective bba's
$m_1(.)$ and $m_2(.)$ given by $m_1(A) = 0.8$, $m_1(B) = 0.2$
and $m_2(A) = 0.4$, $m_2(B) = 0.6$.


• Case 1 (no importance discounting): Let's consider
that $\beta_1 = 1$ and $\beta_2 = 1$, i.e. the sources
must have the same maximal importance in the
fusion rule. In that case, one gets $m_{\beta_1}(.) = m_1(.)$
and $m_{\beta_2}(.) = m_2(.)$, and the bba's are actually
not discounted. The conjunctive rule gives
$m_{12}(A) = 0.32$, $m_{12}(B) = 0.12$, and the mass
$m_{12}(A \cap B = \emptyset) = 0.56$ is redistributed back to
$A$ and $B$ proportionally to their masses following
the PCR5 principle explained in section 2. We get
the following result:


      | m_{β1=1}(.) | m_{β2=1}(.) | m_12(.) | m_PCR5∅(.)
  ∅   | 0           | 0           | 0.56    | (illegible)
  A   | 0.8         | 0.4         | 0.32    | (illegible)
  B   | 0.2         | 0.6         | 0.12    | 0.36

Table 1: PCR5 fusion of $m_{\beta_1=1}(.)$ with $m_{\beta_2=1}(.)$.


• Case 2 (with importance discounting): Let's now take
the importance factors $\beta_1 = 0.2$ and $\beta_2 = 0.8$
(note that in general we don't need to force the
sum of the $\beta_i$ to be one, unless one wants to deal with
relative importances between sources). Applying the
importance discounting technique and the normalization
of $m_{PCR5_\emptyset}$, denoted $m^{norm}_{PCR5_\emptyset}(.)$, one gets:


      | m_{β1=0.2}(.) | m_{β2=0.8}(.) | m_12(.)     | m^norm_PCR5∅(.)
  ∅   | 0.80          | 0.20          | (illegible) | (illegible)
  A   | 0.16          | 0.32          | 0.0512      | (illegible)
  B   | 0.04          | 0.48          | 0.0192      | 0.57

Table 2: PCR5 fusion of $m_{\beta_1=0.2}(.)$ with $m_{\beta_2=0.8}(.)$.

Clearly, one sees in Table 2 the strong impact of the
importance discounting on the result with respect
to what we obtain in Table 1 (i.e. without importance
discounting). Note also that the result
is very different from what we would have obtained
by taking $\alpha_1 = 0.2$ and $\alpha_2 = 0.8$ and using the
reliability discounting approach, as seen in Table 3.


Table 3: PCR5 fusion of $m_{\alpha_1=0.2}(.)$ with $m_{\alpha_2=0.8}(.)$. [Table entries not recoverable.]


By comparing Table 2 with Table 3, one sees the
clear difference in results obtained by these two
discounting techniques, which is normal.
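
Case 2 can be reproduced with the following Python sketch. It is an illustration under explicit assumptions, not the authors' code: bba's are dictionaries keyed by frozensets, the function names are hypothetical, and the treatment of the empty set is one reading of formula (6), namely that a product involving the empty set and a non-empty element is a partial conflict shared between the two, while the product of the two empty-set masses stays on the empty set.

```python
from itertools import product

EMPTY = frozenset()


def importance_discount(m, beta):
    """Scale every mass by beta and move 1 - beta to the empty set."""
    discounted = {x: beta * w for x, w in m.items()}
    discounted[EMPTY] = discounted.get(EMPTY, 0.0) + (1.0 - beta)
    return discounted


def pcr5_empty_two_sources(m1, m2):
    """Two-source PCR5 keeping the absolute empty set as a focal element."""
    fused = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        if w1 == 0.0 or w2 == 0.0:
            continue
        inter = x1 & x2
        if inter or x1 == x2:
            # Conjunctive part; the pair (EMPTY, EMPTY) stays on EMPTY.
            fused[inter] = fused.get(inter, 0.0) + w1 * w2
        else:
            # Partial conflict, shared proportionally between x1 and x2.
            conflict = w1 * w2
            fused[x1] = fused.get(x1, 0.0) + w1 * conflict / (w1 + w2)
            fused[x2] = fused.get(x2, 0.0) + w2 * conflict / (w1 + w2)
    return fused


def normalize(m):
    """Redistribute the empty-set mass proportionally to the other masses."""
    rest = 1.0 - m.get(EMPTY, 0.0)
    return {x: w / rest for x, w in m.items() if x != EMPTY}


A, B = frozenset("A"), frozenset("B")
m1 = importance_discount({A: 0.8, B: 0.2}, 0.2)
m2 = importance_discount({A: 0.4, B: 0.6}, 0.8)
fused = pcr5_empty_two_sources(m1, m2)
result = normalize(fused)
print({tuple(x): round(w, 4) for x, w in result.items()})
```

Under this reading the normalized mass of $B$ rounds to 0.57, so the source that was granted the larger importance factor now dominates the decision.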


6 Reliability and importance 


In this section, we examine the possibility of taking into
account both the reliabilities $\alpha_i$ and the importances
$\beta_i$ of given sources of evidence characterized by their
bba's $m_i(.)$, $i = 1, 2, \ldots, s$. The main question is how
to deal with these two distinct discounting factors, since
in general, except when $\alpha_i = \beta_i = 1$, the reliability and
importance discounting approaches do not commute.
Indeed, it can be easily verified (see the next example)
that $m_{\alpha_i,\beta_i}(.) \neq m_{\beta_i,\alpha_i}(.)$ whenever $\alpha_i \neq 1$ and $\beta_i \neq 1$.
$m_{\alpha_i,\beta_i}(.)$ denotes the reliability discounting of $m_i(.)$ by
$\alpha_i$ followed by the importance discounting of $m_{\alpha_i}(.)$
by $\beta_i$, which explains the notation $\alpha_i,\beta_i$ used in the index.
Similarly, $m_{\beta_i,\alpha_i}(.)$ denotes the importance discounting
of $m_i(.)$ by $\beta_i$ followed by the reliability discounting of
$m_{\beta_i}(.)$ by $\alpha_i$. To deal with both reliability and importance
factors, and because of the non-commutativity of
these discountings, we propose to carry out the fusion of
the sources in a three-step process as follows:


Method 1: Step 1: Apply reliability and then importance
discountings to get $m_{\alpha_i,\beta_i}(.)$, $i = 1, \ldots, s$,
combine them with $PCR5_\emptyset$ or $PCR6_\emptyset$ and normalize
the resulting bba; Step 2: Apply importance and then
reliability discountings to get $m_{\beta_i,\alpha_i}(.)$, $i = 1, \ldots, s$,
combine them with $PCR5_\emptyset$ or $PCR6_\emptyset$ and normalize
the resulting bba; Step 3 (mixing/averaging): Combine
the resulting bba's of Steps 1 and 2 using the arithmetic
mean operator⁸.
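
The non-commutativity that motivates this three-step process can be observed on a single source with the Python sketch below. The dictionary representation, the function names and the numerical values are assumptions made for the illustration; in particular, the reliability discounting is applied to the empty-set mass like to any other element different from $\Theta$.

```python
EMPTY = frozenset()


def reliability_discount(m, alpha, theta):
    """Scale every mass by alpha and move 1 - alpha to the full ignorance theta."""
    discounted = {x: alpha * w for x, w in m.items()}
    discounted[theta] = discounted.get(theta, 0.0) + (1.0 - alpha)
    return discounted


def importance_discount(m, beta):
    """Scale every mass by beta and move 1 - beta to the empty set."""
    discounted = {x: beta * w for x, w in m.items()}
    discounted[EMPTY] = discounted.get(EMPTY, 0.0) + (1.0 - beta)
    return discounted


A, B = frozenset("A"), frozenset("B")
theta = A | B
m = {A: 0.8, B: 0.2}          # illustrative bba
alpha, beta = 0.8, 0.9        # illustrative factors

# Reliability first, then importance (input of Step 1).
m_alpha_beta = importance_discount(reliability_discount(m, alpha, theta), beta)
# Importance first, then reliability (input of Step 2).
m_beta_alpha = reliability_discount(importance_discount(m, beta), alpha, theta)

for name, bba in (("alpha,beta", m_alpha_beta), ("beta,alpha", m_beta_alpha)):
    print(name, {tuple(sorted(x)): round(w, 4) for x, w in bba.items()})
```

The masses of $A$ and $B$ coincide in both orders, but the masses left on $\Theta$ and on the empty set differ (0.18 and 0.10 versus 0.20 and 0.08 here), which is why the two orders are fused separately and then averaged.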

Method 2: A simpler method, which could be
useful for intermediate traceability in some applications,
would consist in replacing Steps 1 & 2 by Step 1': Apply
reliability discounting only to get $m_{\alpha_i}(.)$ and combine
them with PCR5 or PCR6; Step 2': Apply importance
discounting only to get $m_{\beta_i}(.)$ and combine them with
$PCR5_\emptyset$ or $PCR6_\emptyset$ and normalize the result; Step 3':
same as Step 3. Due to space limitation, only Method
1 is briefly illustrated in the following simple example.

⁸ Other combination rules, like PCR5 or PCR6, could also be
used, but we don't see a solid justification for using them again,
and they require more computations than the simple arithmetic
mean.


Example 2: Let's take $\Theta = \{A, B, C\}$, Shafer's model,
three sources $m_1(.)$, $m_2(.)$ and $m_3(.)$ given in the next table,
and assume that their reliability factors are $\alpha_1 = 0.8$,
$\alpha_2 = 0.5$ and $\alpha_3 = 0.2$ and their importance factors
are $\beta_1 = 0.9$, $\beta_2 = 0.3$ and $\beta_3 = 0.6$.


Table 4: Sources of evidences. [Table entries not recoverable.]


By applying reliability followed by importance discounting,
and by applying importance followed by reliability
discounting, one gets:


[Legible residue of one table column: 0.5760, 0, 0.0720, 0, 0.0720, 0, 0.1800. Other entries not recoverable.]

Table 6: Importance-Reliability discounting.


The normalized results of the $PCR5_\emptyset$ fusion of
$m_{\alpha_i,\beta_i}(.)$ for $i = 1, 2, 3$ (Step 1) and of the $PCR5_\emptyset$ fusion of
$m_{\beta_i,\alpha_i}(.)$ for $i = 1, 2, 3$ (Step 2) are given in Table
7 with their arithmetic mean $m_{PCR5}(.)$ (Step 3).


Table 7: Results of Steps 1, 2 & 3. [Columns: Step 1, $m^{norm}_{PCR5_\emptyset,\alpha\beta}(.)$; Step 2, $m^{norm}_{PCR5_\emptyset,\beta\alpha}(.)$; Step 3, $m_{PCR5}(.)$. Table entries not recoverable.]


7 Conclusions


The proposition of two different discounting techniques
is an important contribution to consider both
preferences and reliability in fusion problems for
decision-making purposes. In this paper, we have proposed
a new solution for taking into account the different
importances of sources of evidence in their combination.


We have shown the clear distinction between the classical
reliability discounting technique and our new importance
discounting method, which can be used with
extensions of the PCR5 and PCR6 fusion rules developed in
the DSmT framework. It has also been shown that Dempster's
rule cannot be applied satisfactorily with this importance
discounting approach, contrary to the PCR5
and PCR6 rules. Importance and reliability can
now be distinguished in the fusion of sources, which
introduces a link with Multi-Criteria Decision Problems
in the fusion of sources of information. Applications
of these techniques for risk prevention against natural
catastrophes in mountains are in progress and results
will be published in forthcoming publications.


Appendix: Matlab code listings for PCR5 and PCR6


For convenience, we provide two Matlab™ routines
for PCR5 and PCR6 for the fusion of $s \geq 2$ sources
for working with $2^\Theta$, i.e. working with Shafer's model.
Some adaptations need to be done to work on other
fusion spaces and to work with $PCR5_\emptyset$ and $PCR6_\emptyset$.
No verification of input is done in the routines. It
is assumed that the input matrix BBA is correct,
both in dimension and in content. No attempt at
fast computation or memory optimization is made
in these very simple and basic codes. The derivation
of all possible combinations in the loop with the
combvec(Combinations,vec) instruction is a very
time-consuming task when the size of the problem
increases, and should be done once outside the routines.
The j-th column of the BBA input matrix corresponds
to the (vertical) bba vector $m_j(.)$ associated with the
j-th source $s_j$. Each element of a BBA matrix is in
[0,1] and the sum of each column must be one. If $N$ is
the cardinality of the frame $\Theta$ and if $S$ is the number
of sources, then the size of the BBA input matrix is
$(2^N - 1) \times S$. Each column of the BBA matrix
must use the following binary encoding of elements⁹
of $2^\Theta \setminus \{\emptyset\}$. For example, if $\Theta = \{A, B, C\}$, then
the binary sequences are 001 = A, 010 = B, 011 = A ∪ B, ...,
111 = A ∪ B ∪ C. These codes can be used and shared
for free for research purposes only. Commercial use of
these codes, or of any adaptation of them, is not allowed
without written agreement of the author. The use of
these codes is at the own risk of the user.


⁹ Since one always considers normal input bba's such that
$m_j(\emptyset) = 0$, $j = 1, \ldots, S$, one doesn't need to store these values
in the BBA matrix. For $PCR5_\emptyset$ and $PCR6_\emptyset$ however, one
needs to include as first row of BBA the $m_j(\emptyset) \geq 0$ resulting
from importance discounting of the sources and make a proper
adaptation of indexes in the routines.

File : PCR5fusion.m


function [mPCR5,TotalConflict]=PCR5fusion(BBA)
% Author and copyrights: Jean Dezert
% Input: BBA matrix
% Output: mPCR5 = resulting bba after fusion with PCR5
%         TotalConflict = level of total conflict between sources
NbrSources=size(BBA,2);
CardTheta=log2(size(BBA,1)+1);
if (NbrSources==1)
    mPCR5=BBA(:,1); TotalConflict=0; return
end
Card2PowerTheta=2^(CardTheta)-1;
% All possible combinations
vec=[1:Card2PowerTheta];
Combinations=vec;
for s=1:NbrSources-1
    Combinations=combvec(Combinations,vec);
end
Combinations=Combinations';
mPCR5=zeros(Card2PowerTheta,1);
TotalConflict=0;
NbrComb=size(Combinations,1);
for c=1:NbrComb
    PC=Combinations(c,:); % particular combination
    mConj=zeros(1,NbrSources);
    for s=1:NbrSources
        mConj(s)=BBA(PC(s),s);
    end
    massConj=prod(mConj,2);
    if (massConj>0)
        % Check if this is a real partial conflict or not
        Intersections=PC(1);
        for s=2:NbrSources
            X=PC(s);
            Intersections=bitand(Intersections,X);
        end
        if (Intersections~=0) % the intersection is not empty
            mPCR5(Intersections)=mPCR5(Intersections)+massConj;
        else % the intersection is empty
            TotalConflict=TotalConflict+massConj;
            % Let's apply PCR5 rule principle
            UQ=unique(PC);
            Proportions=0*UQ;
            DenPCR5=0;
            for u=1:size(UQ,2)
                SamePropositions=find(PC==UQ(u));
                MassProd=prod(mConj(SamePropositions));
                Proportions(u)=MassProd*massConj;
                DenPCR5=DenPCR5+MassProd;
            end
            Proportions=Proportions/DenPCR5;
            % PCR5 redistribution
            for u=1:size(UQ,2)
                mPCR5(UQ(u))=mPCR5(UQ(u))+Proportions(u);
            end
        end
    end
end
return
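
The binary encoding used by the routine above, where the row index of a focal element is the integer whose bits mark the elements of the frame it contains, can be illustrated with a short Python sketch. Python is used here only so that the example runs without Matlab; the helper names `encode` and `decode` are hypothetical, and the bit order (A on the least significant bit) is chosen to match the 001 = A, 010 = B convention of the appendix.

```python
def encode(subset, frame):
    """Integer code of a subset: bit i is set when frame[i] belongs to the subset."""
    return sum(1 << i for i, element in enumerate(frame) if element in subset)


def decode(code, frame):
    """Inverse mapping: recover the subset of frame elements from its integer code."""
    return {element for i, element in enumerate(frame) if code & (1 << i)}


frame = ["A", "B", "C"]
code_a = encode({"A"}, frame)         # binary 001
code_ab = encode({"A", "B"}, frame)   # binary 011
code_bc = encode({"B", "C"}, frame)   # binary 110

# Set intersection is a bitwise AND on the codes, as with bitand in the routine.
common = code_ab & code_bc            # binary 010, i.e. B
disjoint = code_a & code_bc           # 0, i.e. an empty intersection (a conflict)

print(format(code_ab, "03b"), format(code_bc, "03b"), format(common, "03b"))
print(decode(common, frame), disjoint)
```

A zero result of the bitwise AND is exactly the test the routine uses to detect a partial conflict, while a non-zero result directly gives the row that receives the conjunctive mass.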


File : PCR6fusion.m 


function [mPCR6,TotalConflict]=PCR6fusion(BBA)
% Author and copyrights: Jean Dezert
% Input: BBA matrix
% Output: mPCR6 = resulting bba after fusion with PCR6
%         TotalConflict = level of total conflict between sources
NbrSources=size(BBA,2);
CardTheta=log2(size(BBA,1)+1);
if (NbrSources==1)
    mPCR6=BBA(:,1);
    TotalConflict=0;
    return
end
Card2PowerTheta=2^(CardTheta)-1;
% All possible combinations
vec=[1:Card2PowerTheta];
Combinations=vec;
for s=1:NbrSources-1
    Combinations=combvec(Combinations,vec);
end
Combinations=Combinations';
mPCR6=zeros(Card2PowerTheta,1);
TotalConflict=0;
NbrComb=size(Combinations,1);
for c=1:NbrComb
    PC=Combinations(c,:); % particular combination
    mConj=zeros(1,NbrSources);
    for s=1:NbrSources
        mConj(s)=BBA(PC(s),s);
    end
    massConj=prod(mConj,2);
    if (massConj>0)
        Intersections=PC(1);
        for s=2:NbrSources
            X=PC(s);
            Intersections=bitand(Intersections,X);
        end
        if (Intersections~=0) % intersection not empty
            mPCR6(Intersections)=mPCR6(Intersections)+massConj;
        else % empty intersection
            TotalConflict=TotalConflict+massConj;
            % PCR6 rule principle
            for s=1:NbrSources
                Proportion=mConj(s)*(massConj/(sum(mConj,2)));
                % Redistribution back to element PC(s)
                mPCR6(PC(s))=mPCR6(PC(s))+Proportion;
            end
        end
    end
end
return


Is Entropy Enough to Evaluate the Probability
Transformation Approach of Belief Function?


Deqiang Han 
Jean Dezert 
Chongzhao Han 
Yi Yang 


Abstract: In the Dempster-Shafer Theory (DST) of evidence
and the transferable belief model (TBM), the probability
transformation is necessary and crucial for
decision-making. The evaluation of the quality of the
probability transformation is usually based on the entropy
or the probabilistic information content (PIC)
measures, which are questioned in this paper. An
alternative probability transformation approach based
on uncertainty minimization is proposed to verify
the rationality of the entropy or PIC as evaluation
criteria for the probability transformation. According
to the experimental results based on comparisons
among different probability transformation approaches,
the rationality of using entropy or PIC measures
to evaluate probability
transformation approaches is analyzed and discussed.


Keywords: TBM, uncertainty, pignistic probability 
transformation, evidence theory, decision-making. 


1 Introduction 


Evidence theory, also known as Dempster-Shafer Theory (DST) [1,2], can reason with imperfect information, including imprecision, uncertainty and incompleteness. It is widely used in many fields of information fusion. Evidence theory also has drawbacks and open problems, e.g. its high computational complexity, the counter-intuitive behaviors of Dempster's combination rule, and decision-making. Several modified, refined or extended models have been proposed to resolve these problems, such as the transferable belief model (TBM) [3] proposed by Philippe Smets and Dezert-Smarandache Theory (DSmT) [4] proposed by Jean Dezert and Florentin Smarandache.
The goal of uncertainty reasoning is decision-making. To take a decision, the belief assignment values of compound focal elements must first be assigned to the singletons. The probability transformation of belief functions is therefore crucial for decision-making in evidence theory, and research on probability transformation has attracted increasing attention in recent years.

The most famous probability transformation in evidence theory is the pignistic probability transformation (PPT) in TBM. TBM has two levels: the credal level and the pignistic level. At the credal level, beliefs are entertained, combined and updated, while at the pignistic level the PPT maps the beliefs defined on subsets to probabilities defined on singletons, so that a classical probabilistic decision can be made. In PPT, the belief assignment value of a compound focal element is divided equally among the singletons belonging to that focal element. In fact, PPT is designed according to the principle of minimal commitment, which is related to uncertainty maximization. But the goal of information fusion at the decision level is to reduce the degree of uncertainty; more uncertainty might not be helpful for the decision. Because PPT uses equal weights when splitting the masses of partial uncertainties and redistributing them back to the singletons they include, other researchers have proposed modified probability transformation approaches [5-13] that assign the belief assignment values of compound focal elements to the singletons according to ratios constructed from the available information. Typical approaches include Sudano's probabilities [8] and Cuzzolin's intersection probability [13]. In the framework of DSmT, another probability transformation approach, called DSmP [9], was proposed. DSmP takes into account both the values of the masses and the cardinality of focal elements in the proportional redistribution process. DSmP can also be used in Shafer's model within the DST framework.

In almost all the research works on probability transformations, the entropy or Probabilistic Information Content (PIC) criteria are used to evaluate the probability transformation approaches. Certainly, for the purpose of decision, less uncertainty makes for a clearer and more solid decision. But is a probability distribution generated from belief functions with less uncertainty always rational, or always beneficial to the decision? We do not think so. In this paper, an alternative probability transformation approach based on uncertainty minimization is proposed. The objective function is established based on the Shannon entropy, and the constraints are established based on the given belief and plausibility functions. The experimental results on the provided numerical examples show that the probability distributions generated by the proposed alternative approach have the least uncertainty degree when compared with other approaches. If the entropy or PIC is used to evaluate the proposed probability transformation approach, the probability distribution with the least uncertainty seemingly should be the optimal one. But risky and strange results can be derived in some cases, as illustrated in the numerical examples. It can be concluded that the entropy or PIC, i.e. the uncertainty degree, might not be enough to evaluate a probability transformation approach. In other words, the entropy or PIC should not be used as the only criterion for the evaluation.


2 Basics of evidence theory and 
probability transformation 
2.1 Basics of evidence theory 


In Dempster-Shafer theory [2], the elements of the frame of discernment (FOD) $\Theta$ are mutually exclusive. Define the function $m : 2^{\Theta} \to [0,1]$ as the basic probability assignment (BPA, also called mass function), which satisfies:

$$\sum_{A \subseteq \Theta} m(A) = 1, \qquad m(\emptyset) = 0 \tag{1}$$

The belief function and the plausibility function are defined respectively in (2) and (3):

$$Bel(A) = \sum_{B \subseteq A} m(B) \tag{2}$$

$$Pl(A) = \sum_{A \cap B \neq \emptyset} m(B) \tag{3}$$

Dempster's rule of combination is defined as follows: if $m_1, m_2, \ldots, m_n$ are $n$ mass functions, the new combined evidence can be derived based on (4):

$$m(A) = \begin{cases} 0, & A = \emptyset \\[4pt] \dfrac{\sum_{\cap A_i = A} \prod_{1 \le i \le n} m_i(A_i)}{\sum_{\cap A_i \neq \emptyset} \prod_{1 \le i \le n} m_i(A_i)}, & A \neq \emptyset \end{cases} \tag{4}$$
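As an illustration of rule (4), the following minimal Python sketch combines any number of mass functions. The representation (a dictionary mapping `frozenset` focal elements to masses), the function name `dempster_combine` and the two-source example values are assumptions made for this sketch only; they do not come from the paper.

```python
from itertools import product

def dempster_combine(*masses):
    """Combine mass functions (dict: frozenset -> mass) with Dempster's rule (4)."""
    fused = {}
    for combo in product(*(m.items() for m in masses)):
        # Intersection of the focal elements A_1, ..., A_n picked from each source.
        inter = frozenset.intersection(*(focal for focal, _ in combo))
        weight = 1.0
        for _, value in combo:
            weight *= value
        if inter:  # products with an empty intersection (conflict) are dropped
            fused[inter] = fused.get(inter, 0.0) + weight
    total = sum(fused.values())  # denominator of (4), i.e. 1 - total conflict
    if total == 0.0:
        raise ValueError("sources are totally conflicting; rule (4) is undefined")
    return {focal: value / total for focal, value in fused.items()}

# Hypothetical two-source example on the frame {a, b} (values chosen for illustration).
A, B = frozenset("a"), frozenset("b")
m1 = {A: 0.6, A | B: 0.4}
m2 = {B: 0.5, A | B: 0.5}
fused = dempster_combine(m1, m2)
```

Here the conflicting product $m_1(A)\,m_2(B) = 0.3$ is discarded and the remaining masses are renormalized by $0.7$.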


Dempster's rule of combination is used in DST to fuse bodies of evidence. But the final goal of information fusion at the decision level is to make a decision. The belief function (or the BPA, or the plausibility function) should therefore be transformed into a probability before probability-based decision-making. Although there is also research on making decisions directly from the belief function or BPA [14], probability-based decision methods are the development trend in uncertainty reasoning [15], because the two-level reasoning and decision structure proposed by Smets in his TBM is appealing.


2.2 Pignistic transformation 


As a type of probability transformation approach, the classical pignistic probability in the TBM framework was coined by Philippe Smets. TBM is a subjective and non-probabilistic interpretation of evidence theory. It extends evidence theory to open-world propositions and has a range of tools for handling belief functions, including discounting and conditioning. At the credal level of TBM, beliefs are entertained, combined and updated, while at the pignistic level beliefs are used to make decisions by transforming them into a probability distribution with the pignistic probability transformation (PPT). When working with normalized BPAs, the basic idea of the pignistic transformation is to transfer the positive belief of each compound (or nonspecific) element onto the singletons involved in that element, divided by the cardinality of the element.

Suppose that $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ is the FOD. The PPT for the singletons is as follows [3]:

$$BetP(\theta_i) = \sum_{\theta_i \in B,\ B \in 2^{\Theta}} \frac{m(B)}{|B|} \tag{5}$$

where $2^{\Theta}$ is the power set of the FOD. Based on the pignistic probability derived, the corresponding decision can be made.
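Equation (5) can be sketched in a few lines of Python. The dictionary representation of the BPA and the name `betp` are assumptions of this sketch; the numerical values are those of the two-element BPA used in Example 3 of Section 4.3.

```python
def betp(bpa, frame):
    """Pignistic probability (5): split each focal mass equally among its singletons."""
    prob = {theta: 0.0 for theta in frame}
    for focal, mass in bpa.items():
        for theta in focal:
            prob[theta] += mass / len(focal)
    return prob

# BPA of Example 3: m({θ1}) = 0.3, m({θ2}) = 0.1, m({θ1, θ2}) = 0.6.
bpa3 = {frozenset([1]): 0.3, frozenset([2]): 0.1, frozenset([1, 2]): 0.6}
p = betp(bpa3, frame=(1, 2))
```

The mass 0.6 of $\{\theta_1, \theta_2\}$ is halved, which gives $BetP(\theta_1) = 0.6$ and $BetP(\theta_2) = 0.4$, in agreement with the BetP row of Table 3.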

In fact, however, PPT follows an idea similar to uncertainty maximization. In general, the PPT is just a simple averaging operation: the mass value is not assigned discriminately to the different singletons involved. For information fusion, though, the aim is to reduce the degree of uncertainty and to obtain a more consolidated and reliable decision result. The high uncertainty in PPT might not be helpful for the decision, so several researchers have sought to modify the traditional PPT. Some typical modified probability transformation approaches are as follows.
1) Sudano's probabilities: Sudano [8] proposed some interesting alternatives to PPT denoted by PrPl, PrNPl, PraPl, PrBel and PrHyb, respectively. Sudano uses different kinds of mappings: proportional to the plausibility, to the normalized plausibility, to all plausibilities, and to the belief, respectively, or a hybrid mapping.


2) Cuzzolin’s intersection probability: In the 
framework of DST, Fabio Cuzzolin [13] proposed an- 
other type of transformation. From a geometric inter- 
pretation of Dempster’s combination rule, an intersec- 
tion probability measure was proposed from the propor- 
tional repartition of the total non specific mass (TNSM) 
by each contribution of the non-specific masses involved 
in it. 


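3) DSmP: Dezert and Smarandache proposed the DSmP as follows. Suppose that the FOD is $\Theta = \{\theta_1, \ldots, \theta_n\}$; then $DSmP_{\varepsilon}(\theta_i)$ can be directly obtained by:

$$DSmP_{\varepsilon}(\theta_i) = m(\{\theta_i\}) + \left(m(\{\theta_i\}) + \varepsilon\right) \cdot \sum_{\substack{X \in 2^{\Theta} \\ \theta_i \subset X \\ |X| \ge 2}} \frac{m(X)}{\sum_{\substack{Y \in 2^{\Theta} \\ Y \subset X \\ |Y| = 1}} m(Y) + \varepsilon \cdot |X|} \tag{6}$$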


In DSmP, both the values of the mass assignment and the cardinality of focal elements are used in the proportional redistribution process. DSmP improves on the Sudano, Cuzzolin and BetP formulas in the sense that it mathematically makes a more accurate redistribution of the ignorance masses to the singletons involved in the ignorance. DSmP works in both theories, DST and DSmT.
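The DSmP redistribution under Shafer's model can be sketched as follows. The function name `dsmp` and the dictionary representation are assumptions of this sketch; the test values are the BPA of Example 3 (Section 4.3), for which Table 3 reports the results for $\varepsilon = 0$ and $\varepsilon = 0.001$.

```python
def dsmp(bpa, frame, eps=0.0):
    """DSmP under Shafer's model: redistribute each compound mass m(X)
    proportionally to the singleton masses inside X (plus eps)."""
    singles = {theta: bpa.get(frozenset([theta]), 0.0) for theta in frame}
    prob = {}
    for theta in frame:
        total = 0.0
        for focal, mass in bpa.items():
            if len(focal) >= 2 and theta in focal:
                # Raises ZeroDivisionError when eps == 0 and every singleton
                # inside the focal element has zero mass.
                denom = sum(singles[y] for y in focal) + eps * len(focal)
                total += mass / denom
        prob[theta] = singles[theta] + (singles[theta] + eps) * total
    return prob

# BPA of Example 3: m({θ1}) = 0.3, m({θ2}) = 0.1, m({θ1, θ2}) = 0.6.
bpa3 = {frozenset([1]): 0.3, frozenset([2]): 0.1, frozenset([1, 2]): 0.6}
p0 = dsmp(bpa3, (1, 2), eps=0.0)
p1 = dsmp(bpa3, (1, 2), eps=0.001)
```

With $\varepsilon = 0$ the compound mass 0.6 is shared in the ratio 0.3 : 0.1, giving (0.75, 0.25); a small positive $\varepsilon$ keeps the denominator non-zero when singleton masses vanish.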


There are still other definitions of modified PPT, such as the iterative and self-consistent approach PrScP proposed by Sudano in [5] and a modified PrScP in [12]. Although the aforementioned approaches are different, all of them are evaluated based on the degree of uncertainty: less uncertainty means that the corresponding probability transformation result is considered better. According to such an idea, the probability transformation approach should attempt to enlarge the belief differences among all the propositions and thus derive a more reliable decision result. Is this definitely rational? Is the uncertainty degree always proper or sufficient to evaluate the probability transformation? In the following section, some uncertainty measures are analyzed and an alternative probability transformation approach based on uncertainty minimization is proposed to test the rationality of the uncertainty degree as the criterion for evaluating the probability transformation.
3 An alternative probability transformation based on uncertainty minimization


3.1 Evaluation criteria for probability 
transformation 


The metrics depicting the strength of a critical decision by a specific probability distribution are introduced as follows:

1) Normalized Shannon entropy

Suppose that $p_{\theta}$ is a probability distribution, where $\theta \in \Theta$, $|\Theta| = N$ and $|\Theta|$ represents the cardinality of the FOD $\Theta$. The evaluation criterion for the probability distribution derived from different probability transformations is as follows [12]:

$$E_H = \frac{-\sum_{\theta \in \Theta} p_{\theta} \log_2 (p_{\theta})}{\log_2 N} \tag{7}$$

The numerator in (7) is the Shannon entropy and the denominator is the maximum value of the Shannon entropy for $\{p_{\theta} \mid \theta \in \Theta\}$, $|\Theta| = N$. Obviously $E_H$ is normalized. The larger $E_H$ is, the larger the degree of uncertainty; the smaller $E_H$ is, the smaller the degree of uncertainty. When $E_H = 0$, exactly one hypothesis has a probability value of 1 and the rest have 0, so the agent or system can make the decision correctly. When $E_H = 1$, it is impossible to make a correct decision, because all the $p_{\theta}$, $\forall \theta \in \Theta$, are equal.

2) Probabilistic Information Content

The Probabilistic Information Content (PIC) criterion is an essential measure in any threshold-driven automated decision system. A PIC value of one indicates the total knowledge to make a correct decision.

$$PIC(P) = 1 + \frac{1}{\log_2 N} \sum_{\theta \in \Theta} p_{\theta} \log_2 (p_{\theta}) \tag{8}$$

Obviously, $PIC = 1 - E_H$: the PIC is the dual of the normalized Shannon entropy. A PIC value of zero indicates that the knowledge to make a correct decision does not exist (all the hypotheses have an equal probability value), i.e. one has the maximal entropy.

As referred above, for information fusion at the decision level, the uncertainty seemingly should be reduced as much as possible: the less uncertainty there is in the probability measure, the more consolidated and reliable the decision. Supposing that this viewpoint is always right, an alternative probability transformation of belief functions is proposed according to it.
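The two criteria (7) and (8) are straightforward to compute. The sketch below is a minimal illustration with hypothetical function names; it uses the convention $0 \cdot \log_2 0 = 0$ and assumes $N \ge 2$.

```python
import math

def normalized_entropy(p):
    """Normalized Shannon entropy E_H of (7); p is a sequence of N >= 2 probabilities."""
    shannon = -sum(x * math.log2(x) for x in p if x > 0)  # 0*log2(0) treated as 0
    return shannon / math.log2(len(p))

def pic(p):
    """Probabilistic Information Content of (8), i.e. 1 - E_H."""
    return 1.0 - normalized_entropy(p)

uniform = pic([0.25, 0.25, 0.25, 0.25])   # no information
certain = pic([1.0, 0.0, 0.0])            # total knowledge
un_min_ex1 = pic([0.73, 0.23, 0.01, 0.03])  # Un_min row of Table 1
```

The last value reproduces, up to rounding, the PIC of 0.4813 reported for Un_min in Table 1.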


3.2 Probability transformation of belief function based on uncertainty minimization

To accomplish the probability transformation, the belief function (or the BPA, or the plausibility function) should be available. The relationship between the probability and the belief function is analyzed as follows. Based on the viewpoint of Dempster and Shafer, the belief function can be considered as a lower probability and the plausibility as an upper probability. Suppose that $p_{\theta} \in [0,1]$ is a probability distribution, where $\theta \in \Theta$. For a belief function defined on the FOD $\Theta$ and any $B \in 2^{\Theta}$, the inequality (9) is satisfied:

$$Bel(B) \le \sum_{\theta \in B} p_{\theta} \le Pl(B) \tag{9}$$

This inequality can be proved according to the properties of the upper and lower probability. The probability distribution $\{p_{\theta} \mid \theta \in \Theta\}$ must also meet the usual requirements for probability distributions, i.e.

$$\begin{cases} 0 \le p_{\theta} \le 1, \ \forall \theta \in \Theta \\ \sum_{\theta \in \Theta} p_{\theta} = 1 \end{cases} \tag{10}$$

In general, there are several probability distributions $\{p_{\theta} \mid \theta \in \Theta\}$ consistent with the given belief function according to the relationships defined in (9) and (10). This is a multi-answer problem, i.e. a one-to-many mapping. As referred above, the probability is used for decision, so the uncertainty seemingly should be as small as possible. We can select one probability distribution from all the consistent alternatives according to the uncertainty minimization criterion and use it as the result of the probability transformation.

The Shannon entropy is used here to establish the objective function. The equations and inequalities in (9) and (10) are used to establish the constraints. The problem of the probability transformation of a belief function is thus converted into a constrained optimization problem:

$$\min_{\{p_{\theta} \mid \theta \in \Theta\}} \left\{ -\sum_{\theta \in \Theta} p_{\theta} \log_2 (p_{\theta}) \right\} \quad \text{s.t.} \quad \begin{cases} Bel(B) \le \sum_{\theta \in B} p_{\theta} \le Pl(B) \\ 0 \le p_{\theta} \le 1, \ \forall \theta \in \Theta \\ \sum_{\theta \in \Theta} p_{\theta} = 1 \end{cases} \tag{11}$$

Given the belief function (or the BPA, or the plausibility), solving (11) yields a probability distribution which has the least uncertainty as measured by the Shannon entropy and thus is seemingly more proper to be used in the decision procedure.

It is clear that the problem of finding a minimum entropy probability distribution does not admit a unique solution in general. The optimization algorithm used is the Quasi-Newton method followed by a global optimization algorithm [16] to alleviate the effect of the local extremum problem. Other intelligent optimization algorithms [17,18] can also be used, such as the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), etc.
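For small frames, problem (11) can also be explored by exhaustive enumeration instead of the Quasi-Newton and global optimization scheme used in the paper. The sketch below is such a swapped-in technique, not the authors' implementation. It assumes the classical result that, for a belief function, the extreme points of the set of probability distributions satisfying (9) and (10) are the "marginal" vectors obtained by ordering the elements and giving each one the increment of Bel, and it uses the fact that a concave function such as the entropy attains its minimum over a polytope at an extreme point. All names are hypothetical.

```python
import math
from itertools import permutations

def bel(bpa, subset):
    """Belief of a subset: total mass of the focal elements it contains."""
    return sum(mass for focal, mass in bpa.items() if focal <= subset)

def entropy(p):
    return -sum(x * math.log2(x) for x in p if x > 1e-12)

def un_min(bpa, frame, tol=1e-9):
    """Return (minimal entropy, list of minimizing distributions) for (11),
    scanning the marginal vector of every ordering of the frame (N! orderings)."""
    frame = tuple(frame)
    best_h, best = float("inf"), []
    for order in permutations(frame):
        seen, prev, p = frozenset(), 0.0, {}
        for theta in order:
            seen = seen | {theta}
            cur = bel(bpa, seen)
            p[theta] = cur - prev      # increment of Bel when theta is added
            prev = cur
        vec = tuple(round(p[t], 10) for t in frame)
        h = entropy(vec)
        if h < best_h - tol:
            best_h, best = h, [vec]
        elif abs(h - best_h) <= tol and vec not in best:
            best.append(vec)           # keep ties: the minimizer need not be unique
    return best_h, best

# Two-element BPA of Example 3 (Section 4.3).
bpa3 = {frozenset([1]): 0.3, frozenset([2]): 0.1, frozenset([1, 2]): 0.6}
h3, sols3 = un_min(bpa3, (1, 2))
```

Because ties are kept, the function also exposes the non-uniqueness discussed in Example 4, where six distributions share the same minimal entropy. The enumeration grows factorially with the size of the frame, so it is only meant for frames as small as those used in the examples.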
4 Analysis based on examples


Two numerical examples are first provided to illustrate some probability transformation approaches. To make the different approaches reviewed and proposed in this paper more comparable, the examples in [6,12] are directly used here. The PIC is used to evaluate the probability transformation.


4.1 Example 1 


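For the FOD $\Theta = \{\theta_1, \theta_2, \theta_3, \theta_4\}$, the corresponding BPA is as follows:

$m(\{\theta_1\}) = 0.16$, $m(\{\theta_2\}) = 0.14$, $m(\{\theta_3\}) = 0.01$, $m(\{\theta_4\}) = 0.02$,
$m(\{\theta_1, \theta_2\}) = 0.20$, $m(\{\theta_1, \theta_3\}) = 0.09$,
$m(\{\theta_1, \theta_4\}) = 0.04$, $m(\{\theta_2, \theta_3\}) = 0.04$,
$m(\{\theta_2, \theta_4\}) = 0.02$, $m(\{\theta_3, \theta_4\}) = 0.01$,
$m(\{\theta_1, \theta_2, \theta_3\}) = 0.10$, $m(\{\theta_1, \theta_2, \theta_4\}) = 0.03$,
$m(\{\theta_1, \theta_3, \theta_4\}) = 0.03$, $m(\{\theta_2, \theta_3, \theta_4\}) = 0.03$,
$m(\Theta) = 0.08$.

The corresponding belief functions are calculated and listed as follows:

$Bel(\{\theta_1\}) = 0.16$, $Bel(\{\theta_2\}) = 0.14$,
$Bel(\{\theta_3\}) = 0.01$, $Bel(\{\theta_4\}) = 0.02$,
$Bel(\{\theta_1, \theta_2\}) = 0.50$, $Bel(\{\theta_1, \theta_3\}) = 0.26$,
$Bel(\{\theta_1, \theta_4\}) = 0.22$, $Bel(\{\theta_2, \theta_3\}) = 0.19$,
$Bel(\{\theta_2, \theta_4\}) = 0.18$, $Bel(\{\theta_3, \theta_4\}) = 0.04$,
$Bel(\{\theta_1, \theta_2, \theta_3\}) = 0.74$, $Bel(\{\theta_1, \theta_2, \theta_4\}) = 0.61$,
$Bel(\{\theta_1, \theta_3, \theta_4\}) = 0.36$, $Bel(\{\theta_2, \theta_3, \theta_4\}) = 0.27$,
$Bel(\Theta) = 1.00$.

The corresponding plausibility functions are calculated and listed as follows:

$Pl(\{\theta_1\}) = 0.73$, $Pl(\{\theta_2\}) = 0.64$, $Pl(\{\theta_3\}) = 0.39$, $Pl(\{\theta_4\}) = 0.26$,
$Pl(\{\theta_1, \theta_2\}) = 0.96$, $Pl(\{\theta_1, \theta_3\}) = 0.82$,
$Pl(\{\theta_1, \theta_4\}) = 0.81$, $Pl(\{\theta_2, \theta_3\}) = 0.78$,
$Pl(\{\theta_2, \theta_4\}) = 0.74$, $Pl(\{\theta_3, \theta_4\}) = 0.50$,
$Pl(\{\theta_1, \theta_2, \theta_3\}) = 0.98$, $Pl(\{\theta_1, \theta_2, \theta_4\}) = 0.99$,
$Pl(\{\theta_1, \theta_3, \theta_4\}) = 0.86$, $Pl(\{\theta_2, \theta_3, \theta_4\}) = 0.84$,
$Pl(\Theta) = 1.00$.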
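The belief and plausibility values listed above follow directly from definitions (2) and (3). A minimal Python sketch, with hypothetical helper names and the BPA of Example 1 encoded as sets of indices, is:

```python
def fs(*indices):
    """Shorthand: fs(1, 3) stands for the subset {θ1, θ3}."""
    return frozenset(indices)

# BPA of Example 1.
BPA1 = {
    fs(1): 0.16, fs(2): 0.14, fs(3): 0.01, fs(4): 0.02,
    fs(1, 2): 0.20, fs(1, 3): 0.09, fs(1, 4): 0.04,
    fs(2, 3): 0.04, fs(2, 4): 0.02, fs(3, 4): 0.01,
    fs(1, 2, 3): 0.10, fs(1, 2, 4): 0.03, fs(1, 3, 4): 0.03, fs(2, 3, 4): 0.03,
    fs(1, 2, 3, 4): 0.08,
}

def bel(bpa, subset):
    """Equation (2): sum the masses of the focal elements included in the subset."""
    return sum(mass for focal, mass in bpa.items() if focal <= subset)

def pl(bpa, subset):
    """Equation (3): sum the masses of the focal elements intersecting the subset."""
    return sum(mass for focal, mass in bpa.items() if focal & subset)
```

For instance, `bel(BPA1, fs(1, 4))` sums $0.16 + 0.02 + 0.04 = 0.22$, and `pl(BPA1, fs(1, 4))` equals $1 - Bel(\{\theta_2, \theta_3\}) = 0.81$.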


Take the probability distribution as the unknown variables. Based on the plausibility functions and the belief functions, the constraints and the objective function can be established according to (11), and the probability distribution can be derived by the minimization. The results of some other probability transformation approaches are also calculated. All the results are listed in Table 1 to allow comparison between the approach proposed in this paper (denoted by Un_min) and the other available approaches.





Table 1 Probability Transformation Results of Example 1 based on Different Approaches

Approach      θ1      θ2      θ3      θ4      PIC
BetP [3]      0.3983  0.3433  0.1533  0.1050  0.0926
PraPl [8]     0.4021  0.3523  0.1394  0.1062  0.1007
PrPl [8]      0.4544  0.3609  0.1176  0.0671  0.1638
PrHyb [8]     0.4749  0.3749  0.0904  0.0598  0.2014
PrBel [8]     0.5176  0.4051  0.0303  0.0470  0.3100
FPT [11]      0.5176  0.4051  0.0303  0.0470  0.3100
DSmP_0 [9]    0.5176  0.4051  0.0303  0.0470  0.3100
PrScP [10]    0.5403  0.3883  0.0316  0.0393  0.3247
PrBP1 [12]    0.5419  0.3998  0.0243  0.0340  0.3480
PrBP2 [12]    0.5578  0.3842  0.0226  0.0353  0.3529
PrBP3 [12]    0.6045  0.3391  0.0255  0.0309  0.3710
Un_min        0.7300  0.2300  0.0100  0.0300  0.4813


4.2 Example 2

For the FOD $\Theta = \{\theta_1, \theta_2, \theta_3, \theta_4\}$, the corresponding BPA is as follows:

$m(\{\theta_1\}) = 0.05$, $m(\{\theta_2\}) = 0.00$, $m(\{\theta_3\}) = 0.00$, $m(\{\theta_4\}) = 0.00$,
$m(\{\theta_1, \theta_2\}) = 0.39$, $m(\{\theta_1, \theta_3\}) = 0.19$,
$m(\{\theta_1, \theta_4\}) = 0.18$, $m(\{\theta_2, \theta_3\}) = 0.04$,
$m(\{\theta_2, \theta_4\}) = 0.02$, $m(\{\theta_3, \theta_4\}) = 0.01$,
$m(\{\theta_1, \theta_2, \theta_3\}) = 0.04$, $m(\{\theta_1, \theta_2, \theta_4\}) = 0.02$,
$m(\{\theta_1, \theta_3, \theta_4\}) = 0.03$, $m(\{\theta_2, \theta_3, \theta_4\}) = 0.03$,
$m(\Theta) = 0.00$.

The corresponding belief functions are calculated and listed as follows:

$Bel(\{\theta_1\}) = 0.05$, $Bel(\{\theta_2\}) = 0.00$,
$Bel(\{\theta_3\}) = 0.00$, $Bel(\{\theta_4\}) = 0.00$,
$Bel(\{\theta_1, \theta_2\}) = 0.44$, $Bel(\{\theta_1, \theta_3\}) = 0.24$,
$Bel(\{\theta_1, \theta_4\}) = 0.23$, $Bel(\{\theta_2, \theta_3\}) = 0.04$,
$Bel(\{\theta_2, \theta_4\}) = 0.02$, $Bel(\{\theta_3, \theta_4\}) = 0.01$,
$Bel(\{\theta_1, \theta_2, \theta_3\}) = 0.71$, $Bel(\{\theta_1, \theta_2, \theta_4\}) = 0.66$,
$Bel(\{\theta_1, \theta_3, \theta_4\}) = 0.46$, $Bel(\{\theta_2, \theta_3, \theta_4\}) = 0.10$,
$Bel(\Theta) = 1.00$.

The corresponding plausibility functions are calculated and listed as follows:

$Pl(\{\theta_1\}) = 0.90$, $Pl(\{\theta_2\}) = 0.54$, $Pl(\{\theta_3\}) = 0.34$, $Pl(\{\theta_4\}) = 0.29$,
$Pl(\{\theta_1, \theta_2\}) = 0.99$, $Pl(\{\theta_1, \theta_3\}) = 0.98$,
$Pl(\{\theta_1, \theta_4\}) = 0.96$, $Pl(\{\theta_2, \theta_3\}) = 0.77$,
$Pl(\{\theta_2, \theta_4\}) = 0.76$, $Pl(\{\theta_3, \theta_4\}) = 0.56$,
$Pl(\{\theta_1, \theta_2, \theta_3\}) = 1.00$, $Pl(\{\theta_1, \theta_2, \theta_4\}) = 1.00$,
$Pl(\{\theta_1, \theta_3, \theta_4\}) = 1.00$, $Pl(\{\theta_2, \theta_3, \theta_4\}) = 0.95$,
$Pl(\Theta) = 1.00$.

Take the probability distribution as the unknown variables. Based on the plausibility functions and the belief functions, the constraints and the objective function can be established according to (11), and the probability distribution can be derived by the minimization. The results of some other probability transformation approaches are also calculated. All the results are listed in Table 2 to allow comparison between the approach proposed in this paper and the other available approaches.

Table 2 Probability Transformation Results of Example 2 based on Different Approaches

Approach      θ1      θ2      θ3      θ4      PIC
PrBel [8]     N/A due to 0 value of singletons
FPT [11]      N/A due to 0 value of singletons
PrScP [10]    N/A due to 0 value of singletons
PrBP1 [12]    N/A due to 0 value of singletons
PraPl [8]     0.4630  0.2478  0.1561  0.1331  0.0907
BetP [3]      0.4600  0.2550  0.1533  0.1317  0.0910
PrPl [8]      0.6161  0.2160  0.0960  0.0719  0.2471
PrBP2 [12]    0.6255  0.2109  0.0936  0.0700  0.2572
PrHyb [8]     0.6368  0.2047  0.0909  0.0677  0.2698
DSmP_0 [9]    0.5162  0.4043  0.0319  0.0477  0.3058
PrBP3 [12]    0.8823  0.0830  0.0233  0.0114  0.5449
Un_min        0.9000  0.0900  0.0000  0.0100  0.7420

N/A in Table 2 means "Not available". DSmP_0 means the parameter $\varepsilon$ in DSmP is 0.
Based on the experimental results listed in Table 1 and Table 2, it can be concluded that the probability derived with the proposed approach (denoted by Un_min) has significantly lower uncertainty than those of the other probability transformation approaches. The difference among the propositions is further enlarged, which is seemingly helpful for a more consolidated and reliable decision.


Important remark: In fact, there exist fatal deficiencies in the probability transformation based on uncertainty minimization, which are illustrated in the following examples.


4.3 Example 3

The FOD and BPA are as follows [4]:

$\Theta = \{\theta_1, \theta_2\}$, $m(\{\theta_1\}) = 0.3$, $m(\{\theta_2\}) = 0.1$, $m(\{\theta_1, \theta_2\}) = 0.6$.

Based on different approaches, the experimental results are derived as listed in Table 3.

Table 3 Probability Transformation Results of Example 3 based on Different Approaches

Approach      θ1      θ2      PIC
BetP          0.6000  0.4000  0.0291
PrPl          0.6375  0.3625  0.0553
PraPl         0.6375  0.3625  0.0553
PrHyb         0.6825  0.3175  0.0984
DSmP_0.001    0.7492  0.2508  0.1875
PrBel         0.7500  0.2500  0.1887
DSmP_0        0.7500  0.2500  0.1887
Un_min        0.9000  0.1000  0.5310

DSmP_0 means the parameter $\varepsilon$ in DSmP is 0 and DSmP_0.001 means the parameter $\varepsilon$ in DSmP is 0.001.

Is the probability transformation based on PIC maximization (i.e. entropy minimization) rational?

It can be observed in our very simple Example 3 that all of the mass of belief 0.6 committed to $\{\theta_1, \theta_2\}$ is actually redistributed only to the singleton $\{\theta_1\}$ by the Un_min transformation in order to get the maximum of PIC.

A deeper analysis shows that with the Un_min transformation, the mass of belief $m(\{\theta_1, \theta_2\}) > 0$ is always fully distributed back to $\{\theta_1\}$ as soon as $m(\{\theta_1\}) > m(\{\theta_2\})$, in order to obtain the maximum of PIC (i.e. the minimum of entropy). This holds even in very particular situations where the difference between the masses of the singletons is very small, as in the following example:

$\Theta = \{\theta_1, \theta_2\}$, $m(\{\theta_1\}) = 0.1000001$, $m(\{\theta_2\}) = 0.1$, $m(\{\theta_1, \theta_2\}) = 0.7999999$.

This modified example shows that the probability obtained from the minimum entropy principle yields a counter-intuitive result, because $m(\{\theta_1\})$ is almost the same as $m(\{\theta_2\})$ and so there is no solid reason to obtain a very high probability for $\theta_1$ and a small probability for $\theta_2$. Therefore, a decision based on the result of the Un_min transformation is too risky. Sometimes uncertainty can be useful, and sometimes it is better not to take a decision than to take the wrong decision. So the criterion of uncertainty minimization is not sufficient for evaluating the quality or efficiency of a probability transformation. There are also other problems in the probability transformation based on the uncertainty minimization principle, which are illustrated in our next example.
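On a two-element frame the behaviour described above can be checked by hand. Constraint (9) restricts $p_{\theta_1}$ to the interval $[Bel(\{\theta_1\}), Pl(\{\theta_1\})]$, and since the entropy is concave its minimum lies at one of the two endpoints. The following sketch (function names are hypothetical) simply compares the two endpoints:

```python
import math

def binary_entropy(p):
    """Shannon entropy (in bits) of the distribution (p, 1 - p)."""
    return -sum(x * math.log2(x) for x in (p, 1.0 - p) if x > 0)

def un_min_two(m1, m2, m12):
    """Un_min on Θ = {θ1, θ2}: p(θ1) ranges over [m1, m1 + m12] and the
    entropy minimum is attained at one of the two endpoints."""
    low, high = m1, m1 + m12           # Bel({θ1}) and Pl({θ1})
    p1 = min((low, high), key=binary_entropy)
    return p1, 1.0 - p1

example3 = un_min_two(0.3, 0.1, 0.6)
modified = un_min_two(0.1000001, 0.1, 0.7999999)
```

Both calls return approximately (0.9, 0.1): a difference of $10^{-7}$ between the singleton masses is enough to send the whole compound mass to $\theta_1$.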


4.4 Example 4 


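The FOD and BPA are as follows: $\Theta = \{\theta_1, \theta_2, \theta_3\}$, with

$m(\{\theta_1, \theta_2\}) = m(\{\theta_2, \theta_3\}) = m(\{\theta_1, \theta_3\}) = 1/3$.

Using the probability transformation based on uncertainty minimization, we can derive six different probability distributions yielding the same minimal entropy, which are listed as follows:

$P(\{\theta_1\}) = 1/3$, $P(\{\theta_2\}) = 2/3$, $P(\{\theta_3\}) = 0$;
$P(\{\theta_1\}) = 1/3$, $P(\{\theta_2\}) = 0$, $P(\{\theta_3\}) = 2/3$;
$P(\{\theta_1\}) = 0$, $P(\{\theta_2\}) = 1/3$, $P(\{\theta_3\}) = 2/3$;
$P(\{\theta_1\}) = 0$, $P(\{\theta_2\}) = 2/3$, $P(\{\theta_3\}) = 1/3$;
$P(\{\theta_1\}) = 2/3$, $P(\{\theta_2\}) = 1/3$, $P(\{\theta_3\}) = 0$;
$P(\{\theta_1\}) = 2/3$, $P(\{\theta_2\}) = 0$, $P(\{\theta_3\}) = 1/3$.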


It is clear that the problem of finding a probabil- 
ity distribution with minimal entropy does not admit a 
unique solution in general. So if we use the probabil- 
ity transformation based on uncertainty minimization, 
there might exist several probability distributions de- 
rived as illustrated in this Example 4. How to choose 
a unique one? In Example 4, depending on the choice 
of the admissible probability distribution, the decision 
results derived are totally different which is a serious 
problem for decision-making support. 
From our analysis, it can be concluded that the maximization of the PIC criterion (or equivalently the minimization of Shannon entropy) is not sufficient for evaluating the quality of a probability transformation, and other criteria have to be found to give more acceptable probability distributions from belief functions. The search for new criteria for developing new transformations is a very open and challenging problem. Until better probability transformations are found, we suggest using DSmP as one of the most useful probability transformations. Based on the experimental results shown in Examples 1-3, we see that DSmP can always be computed, generates a probability distribution with less uncertainty, and is also not too risky, i.e. DSmP can achieve a better tradeoff between a high PIC value (i.e. low uncertainty) and the risk in decision-making.


5 Conclusion 


Probability transformation of belief functions can be considered as a probabilistic approximation of the belief assignment, which aims to gain more reliable decision results. In this paper, we focus on the evaluation criteria of the probability transformation function. Experimental results based on numerical examples show that the maximization of the PIC criterion proposed by Sudano is insufficient for evaluating the quality of a probability transformation. More rational criteria have to be found to better justify the use of one probability transformation over another.


All the probability transformations developed so far redistribute the mass of partial ignorances to the belief of the singletons included in them. The redistribution is based either only on the cardinality of the partial ignorances, or also on a proportionalization using the masses of the singletons involved in the partial ignorances. However, when the mass of a singleton involved in a partial ignorance is zero, some probability transformations, like Cuzzolin's transformation for example, do not work at all, and that is why the $\varepsilon$ parameter has been introduced in the DSmP transformation to make it work in all cases. In the future, we plan to develop a more comprehensive and rational criterion, which can take both the risk and the uncertainty degree into consideration, to evaluate the quality of a probability transformation and to find an optimal probability distribution from any basic belief assignment.
Algebraic Generalization of Venn Diagram


Florentin Smarandache 


Abstract. 


It is easy to deal with a Venn diagram for 1 ≤ n ≤ 3 sets. When n gets larger, the picture becomes more complicated, which is why we propose the following codification: an easy and systematic algebraic way of dealing with the representation of intersections and unions of many sets.


Introduction. 


Let's first consider 1 ≤ n ≤ 9, and the sets S1, S2, ..., Sn.

Then one gets 2^n − 1 disjoint parts resulting from the intersections of these n sets. Each part is encoded with a positive decimal integer specifying only the sets it belongs to. Thus: part 1 means the part that belongs to S1 (set 1) only, part 2 means the part that belongs to S2 only, ..., part n means the part that belongs to set Sn only.

Similarly, part 12 means the part which belongs to S1 and S2 only, i.e. to S1∩S2 only.

Also, for example, part 1237 means the part that belongs to the sets S1, S2, S3, and S7 only, i.e. to the intersection S1∩S2∩S3∩S7 only. And so on. This helps in constructing a base formed by all these disjoint parts, and in implementing in a computer program each set from the power set P(S1, S2, ..., Sn) using a binary number.

The sets S1, S2, ..., Sn are intersected in all possible ways in a Venn diagram. Let 1 ≤ k ≤ n be an integer. Let's denote by i1i2...ik the Venn diagram region/part that belongs to the sets Si1 and Si2 and ... and Sik only, for all k and all n. The part which is outside of all sets (i.e. the complement of the union of all sets) is denoted by 0 (zero). Each Venn diagram will have 2^n disjoint parts, and each such disjoint part (except the above part 0) will be formed by combinations of k numbers from the numbers 1, 2, 3, ..., n.


Example. 


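Let us see an example for n = 3, and the sets S1, S2, and S3.

Fig. 1. [Venn diagram of the sets S1, S2 and S3 with the disjoint parts labeled by their codes, e.g. 0 and 23; image not reproduced.]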
Unions and Intersections of Sets.


This codification is user friendly for algebraically doing unions and intersections in a simple way.

The union of the sets Sa, Sb, ..., Sv is formed by all disjoint parts that have in their index either the number a, or the number b, ..., or the number v.

The intersection of Sa, Sb, ..., Sv is formed by all disjoint parts that have in their index all the numbers a, b, ..., v.

For n = 3 and the above diagram:

S1∪S23 = {1, 12, 13, 23, 123}, i.e. all disjoint parts that include in their indexes either the digit 1, or the digits 23;

and S1∩S2 = {12, 123}, i.e. all disjoint parts that have in their index the digits 12.
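The codification lends itself to a direct implementation. In the sketch below each disjoint part is a `frozenset` of set indices and is printed by concatenating its digits (valid for n ≤ 9). The function names are hypothetical, and S23 is read as the collection of parts whose index contains both 2 and 3, which is an interpretation of the notation above.

```python
from itertools import combinations

def venn_parts(n):
    """All 2**n - 1 disjoint parts inside the union (part 0 is excluded)."""
    return [frozenset(c) for k in range(1, n + 1)
            for c in combinations(range(1, n + 1), k)]

def having(parts, indices):
    """Parts whose index contains all the given numbers (an intersection of sets)."""
    need = frozenset(indices)
    return {p for p in parts if need <= p}

def label(part):
    """Code of a part, e.g. frozenset({1, 3}) -> '13' (digits concatenated, n <= 9)."""
    return "".join(str(i) for i in sorted(part))

parts = venn_parts(3)
union = {label(p) for p in having(parts, [1]) | having(parts, [2, 3])}
inter = {label(p) for p in having(parts, [1, 2])}
```

With n = 3 this yields the seven parts 1, 2, 3, 12, 13, 23, 123; `union` contains 1, 12, 13, 23 and 123, and `inter` contains 12 and 123, as in the example above.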


Remarks. 


When n ≥ 10, one uses one space in between numbers: for example, if we want to represent the disjoint part which is the intersection of S3, S10, and S27 only, we use the notation [3 10 27], with blanks in between the set indexes.

Depending on preferences, one can use a character other than the blank between numbers, or one can use the numeration system in base n+1, so each number/index will be represented by a unique character.
Importance of Sources using the Repeated Fusion
Method and the Proportional Conflict Redistribution
Rules #5 and #6


Florentin Smarandache 
Jean Dezert 


Originally published as a scientific note 2010 in HAL archive. https:// 
hal.archives-ouvertes.fr/hal-00471839. Printed with permission. 


Abstract. 

We present in this paper some examples of how to compute by hand the PCR5 fusion rule for three sources, so the reader will better understand its mechanism.

We also take into consideration the importance of sources, which is different from the classical discounting of sources.





1. Introduction. 


Discounting of Sources.
Discounting a source $m_1(.)$ with the coefficient $0 \le \alpha \le 1$ and a source $m_2(.)$ with a coefficient $0 \le \beta \le 1$ (because we are not very confident in them) means to adjust them to $m_1'(.)$ and $m_2'(.)$ such that:

$m_1'(A) = \alpha \cdot m_1(A)$ for $A \neq \Theta$ (total ignorance), and $m_1'(\Theta) = \alpha \cdot m_1(\Theta) + 1 - \alpha$,

and $m_2'(A) = \beta \cdot m_2(A)$ for $A \neq \Theta$ (total ignorance), and $m_2'(\Theta) = \beta \cdot m_2(\Theta) + 1 - \beta$.
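The discounting formulas above can be sketched as follows. The function name `discount`, the dictionary representation and the coefficient 0.8 are assumptions chosen for illustration; the mass values are those of the source $m_1(.)$ used in the worked example of the next section.

```python
def discount(mass, alpha, frame):
    """Classical discounting: scale every mass by alpha and move the
    remaining 1 - alpha onto the total ignorance Θ."""
    theta = frozenset(frame)
    out = {focal: alpha * value for focal, value in mass.items() if focal != theta}
    out[theta] = alpha * mass.get(theta, 0.0) + 1.0 - alpha
    return out

A, B = frozenset("A"), frozenset("B")
m1 = {A: 0.1, B: 0.7, A | B: 0.2}
m1_disc = discount(m1, alpha=0.8, frame="AB")   # alpha = 0.8 is a hypothetical value
```

The discounted masses are 0.08 on A, 0.56 on B and 0.36 on A∪B, so the result still sums to one.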


Importance of Sources using Repeated Fusion.
But if a source is more important than another one (since such a source comes from a more important person with decision power, let's say an executive director), for example if source $m_2(.)$ is twice as important as source $m_1(.)$, then we can combine $m_1(.)$ with $m_2(.)$ and with $m_2(.)$, so we repeat $m_2(.)$ twice. With this procedure, the source which is repeated (combined) more times than another source attracts the result towards its masses (see an example below). Jean Dezert has criticized this method, since if one source is repeated, say, 4 times and another source is repeated 6 times, then combining 4 times $m_1(.)$ with 6 times $m_2(.)$ will give a result different from combining 2 times $m_1(.)$ with 3 times $m_2(.)$, although 4/6 = 2/3. In order to avoid this, we take the simplified fraction n/p, where gcd(n, p) = 1 and gcd is the greatest common divisor of the natural numbers n and p.

This method is still controversial, since when combining n times $m_1(.)$ with p times $m_2(.)$ for n+p sufficiently large, the result is not much different from a previous one which combines $n_1$ times $m_1(.)$ with $p_1$ times $m_2(.)$ for $n_1+p_1$ sufficiently large but a little less than n+p, so the method does not respond well for large numbers.
A more effective method for taking the importance of sources into account consists in
discounting towards the empty set followed by normalization (see especially paper [1] and also [2]).


2. Using $m_{PCR5}$ for 3 Sources.
Example calculated by hand for combining three sources using the PCR5 fusion rule.
Let's say that $m_2(.)$ is 2 times more important than $m_1(.)$; therefore we fuse $m_1(.)$,
$m_2(.)$, $m_2(.)$.

            A       B       A∪B     A∩B=∅
m_1         0.1     0.7     0.2
m_2         0.4     0.1     0.5
m_2         0.4     0.1     0.5
m_122       0.193   0.274   0.050   0.483







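$$\frac{x_{1A}}{0.1} = \frac{y_{1B}}{0.1} = \frac{z_{1A\cup B}}{0.5} = \frac{0.005}{0.7} = \frac{0.05}{7}$$
$x_{1A} = 0.000714$, $y_{1B} = 0.000714$, $z_{1A\cup B} = 0.003572$

$$\frac{x_{2A}}{0.4} = \frac{y_{2B}}{0.7} = \frac{z_{2A\cup B}}{0.5} = \frac{0.14}{1.6} = \frac{0.07}{0.8} = \frac{0.7}{8}$$
$x_{2A} = 0.035000$, $y_{2B} = 0.061250$, $z_{2A\cup B} = 0.043750$

$$\frac{x_{3A}}{0.4} = \frac{y_{3B}}{0.1} = \frac{z_{3A\cup B}}{0.2} = \frac{0.008}{0.7} = \frac{0.08}{7}$$
$x_{3A} = 0.004571$, $y_{3B} = 0.001143$, $z_{3A\cup B} = 0.002286$

$$\frac{x_{4A}}{0.4} = \frac{y_{4B}}{0.1} = \frac{z_{4A\cup B}}{0.2} = \frac{(0.4)(0.1)(0.2)}{0.7} = \frac{0.008}{0.7} = \frac{0.08}{7}$$
$x_{4A} = 0.004571$, $y_{4B} = 0.001143$, $z_{4A\cup B} = 0.002286$

$$\frac{x_{5A}}{0.4} = \frac{y_{5B}}{0.7} = \frac{z_{5A\cup B}}{0.5} = \frac{0.14}{1.6} = \frac{1.4}{16}$$
$x_{5A} = 0.035000$, $y_{5B} = 0.061250$, $z_{5A\cup B} = 0.043750$

$$\frac{x_{6A}}{0.1} = \frac{y_{6B}}{0.1} = \frac{z_{6A\cup B}}{0.5} = \frac{0.005}{0.7} = \frac{0.05}{7}$$
$x_{6A} = 0.000714$, $y_{6B} = 0.000714$, $z_{6A\cup B} = 0.003572$

$$\frac{x_{7A}}{0.1} = \frac{y_{7B}}{(0.1)(0.1)} = \frac{(0.1)(0.1)(0.1)}{0.1+0.01} = \frac{0.001}{0.11}$$
$x_{7A} = 0.000909$, $y_{7B} = 0.000091$

$$\frac{x_{8A}}{0.4} = \frac{y_{8B}}{(0.7)(0.1)} = \frac{(0.4)(0.7)(0.1)}{0.4+0.07} = \frac{0.028}{0.47} = \frac{2.8}{47}$$
$x_{8A} = 0.023830$, $y_{8B} = 0.004170$

$x_{9A} = x_{8A} = 0.023830$
$y_{9B} = y_{8B} = 0.004170$

$$\frac{x_{10A}}{(0.1)(0.4)} = \frac{y_{10B}}{0.1} = \frac{(0.1)(0.4)(0.1)}{0.04+0.1} = \frac{0.004}{0.14} = \frac{0.4}{14} = \frac{0.2}{7}$$
$x_{10A} = 0.001143$, $y_{10B} = 0.002857$

$x_{11A} = x_{10A} = 0.001143$
$y_{11B} = y_{10B} = 0.002857$

$$\frac{x_{12A}}{(0.4)(0.4)} = \frac{y_{12B}}{0.7} = \frac{(0.4)(0.4)(0.7)}{0.16+0.7} = \frac{0.112}{0.86} = \frac{11.2}{86}$$
$x_{12A} = 0.020837$, $y_{12B} = 0.091163$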





              A          B          A∪B
m_122^PCR5    0.345262   0.505522   0.149216
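The hand calculation above can be checked numerically. The following is a minimal sketch under stated assumptions: bba's are dictionaries from `frozenset` to mass, Shafer's model is used (so conflict means an empty set intersection), and the function name `pcr5` is ours. Each conflicting product is shared among the distinct elements involved, proportionally to the product of the masses committed to each element, which is the rule applied in the worked proportions above (for instance the denominator 0.1 + 0.01).

```python
from itertools import product
from math import prod


def pcr5(sources):
    """Sketch of PCR5 for any number of sources under Shafer's model."""
    fused = {}
    # Enumerate every choice of one focal element per source
    for combo in product(*(bba.items() for bba in sources)):
        mass = prod(v for _, v in combo)
        if mass == 0.0:
            continue
        common = frozenset.intersection(*(x for x, _ in combo))
        if common:
            # Non-conflicting product: conjunctive consensus
            fused[common] = fused.get(common, 0.0) + mass
            continue
        # Conflicting product: weight of an element = product of the
        # masses that the sources committed to that same element
        weights = {}
        for x, v in combo:
            weights[x] = weights.get(x, 1.0) * v
        total = sum(weights.values())
        for x, w in weights.items():
            fused[x] = fused.get(x, 0.0) + mass * w / total
    return fused


A, B = frozenset("A"), frozenset("B")
m1 = {A: 0.1, B: 0.7, A | B: 0.2}
m2 = {A: 0.4, B: 0.1, A | B: 0.5}
m122 = pcr5([m1, m2, m2])  # second source repeated twice
print({"".join(sorted(k)): round(v, 6) for k, v in m122.items()})
```

Up to rounding in the last printed digit, the result should agree with the table obtained by hand.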


If we didn't double $m_2(.)$ in the fusion rule, we'd get a different result.
Let's suppose we only fuse $m_1(.)$ with $m_2(.)$:

              A       B       A∪B     A∩B=∅
m_1           0.1     0.7     0.2
m_2           0.4     0.1     0.5
m_12          0.17    0.44    0.10    0.29
m_12^PCR5     0.277   0.623   0.100   0

Here the conflicting mass 0.29 = (0.1)(0.1) + (0.7)(0.4) is redistributed to A and B:
0.005 + 0.101818 goes to A and 0.005 + 0.178182 goes to B.

And now we compare the fusion results:

              A       B       A∪B
m_122^PCR5    0.345   0.506   0.149   - three sources (second source doubled); importance of sources considered;
m_12^PCR5     0.277   0.623   0.100   - two sources; importance of sources not considered.


The more times we repeat $m_2(.)$, the closer $m^{PCR5}(A) \to m_2(A) = 0.4$,
$m^{PCR5}(B) \to m_2(B) = 0.1$, and $m^{PCR5}(A \cup B) \to m_2(A \cup B) = 0.5$. Therefore, when
doubling, tripling, etc. a source, the mass of each element in the frame of discernment tends
towards the mass value of that element in the repeated source (since that source is considered
to have more importance than the others).


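For the readers who want to do the previous calculation with a computer, here is the
$m_{PCR5}$ formula for 3 sources:

$$
\begin{aligned}
m_{PCR5}(A) = m_{123}(A)
&+ \sum_{\substack{X,Y \in G^\Theta \\ A \neq X \neq Y \neq A \\ A \cap X \cap Y = \emptyset}}
\left[ \frac{m_1(A)^2 m_2(X) m_3(Y)}{m_1(A)+m_2(X)+m_3(Y)}
+ \frac{m_1(Y) m_2(A)^2 m_3(X)}{m_1(Y)+m_2(A)+m_3(X)}
+ \frac{m_1(X) m_2(Y) m_3(A)^2}{m_1(X)+m_2(Y)+m_3(A)} \right] \\
&+ \sum_{\substack{X \in G^\Theta \\ A \cap X = \emptyset}}
\left[ \frac{m_1(A)^2 m_2(X) m_3(X)}{m_1(A)+m_2(X)m_3(X)}
+ \frac{m_1(X) m_2(A)^2 m_3(X)}{m_2(A)+m_1(X)m_3(X)}
+ \frac{m_1(X) m_2(X) m_3(A)^2}{m_3(A)+m_1(X)m_2(X)} \right] \\
&+ \sum_{\substack{X \in G^\Theta \\ A \cap X = \emptyset}}
\left[ \frac{m_1(A)^2 m_2(A)^2 m_3(X)}{m_1(A)m_2(A)+m_3(X)}
+ \frac{m_1(X) m_2(A)^2 m_3(A)^2}{m_2(A)m_3(A)+m_1(X)}
+ \frac{m_1(A)^2 m_2(X) m_3(A)^2}{m_1(A)m_3(A)+m_2(X)} \right]
\end{aligned}
$$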


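3. Similarly, let's see the $m_{PCR6}$ formula for 3 sources:

$$
\begin{aligned}
m_{PCR6}(A) = m_{123}(A)
&+ \sum_{\substack{X,Y \in G^\Theta \\ A \neq X \neq Y \neq A \\ A \cap X \cap Y = \emptyset}}
\left[ \frac{m_1(A)^2 m_2(X) m_3(Y)}{m_1(A)+m_2(X)+m_3(Y)}
+ \frac{m_1(Y) m_2(A)^2 m_3(X)}{m_1(Y)+m_2(A)+m_3(X)}
+ \frac{m_1(X) m_2(Y) m_3(A)^2}{m_1(X)+m_2(Y)+m_3(A)} \right] \\
&+ \sum_{\substack{X \in G^\Theta \\ A \cap X = \emptyset}}
\left[ \frac{m_1(A)^2 m_2(X) m_3(X)}{m_1(A)+m_2(X)+m_3(X)}
+ \frac{m_1(X) m_2(A)^2 m_3(X)}{m_1(X)+m_2(A)+m_3(X)}
+ \frac{m_1(X) m_2(X) m_3(A)^2}{m_1(X)+m_2(X)+m_3(A)} \right] \\
&+ \sum_{\substack{X \in G^\Theta \\ A \cap X = \emptyset}}
\left[ \frac{[m_1(A)+m_2(A)]\, m_1(A) m_2(A) m_3(X)}{m_1(A)+m_2(A)+m_3(X)}
+ \frac{[m_2(A)+m_3(A)]\, m_1(X) m_2(A) m_3(A)}{m_1(X)+m_2(A)+m_3(A)}
+ \frac{[m_1(A)+m_3(A)]\, m_1(A) m_2(X) m_3(A)}{m_1(A)+m_2(X)+m_3(A)} \right]
\end{aligned}
$$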











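4. A General Formula for PCR6 for $s \ge 2$ Sources.

$$
m_{PCR6}(A) = m_{12 \ldots s}(A) +
\sum_{\substack{X_1, X_2, \ldots, X_{s-1} \in G^\Theta \\ X_i \neq A,\ i \in \{1, 2, \ldots, s-1\} \\ \left(\bigcap_{i=1}^{s-1} X_i\right) \cap A = \emptyset}}
\ \sum_{k=1}^{s-1}
\ \sum_{(i_1, i_2, \ldots, i_s) \in P(1, 2, \ldots, s)}
\left[ m_{i_1}(A) + m_{i_2}(A) + \ldots + m_{i_k}(A) \right] \cdot
\frac{m_{i_1}(A)\, m_{i_2}(A) \ldots m_{i_k}(A)\, m_{i_{k+1}}(X_1) \ldots m_{i_s}(X_{s-k})}
{m_{i_1}(A) + m_{i_2}(A) + \ldots + m_{i_k}(A) + m_{i_{k+1}}(X_1) + \ldots + m_{i_s}(X_{s-k})}
$$

where P(1, 2, ..., s) is the set of all permutations of the elements {1, 2, ..., s}.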
It should be observed that $X_1, X_2, \ldots, X_{s-1}$ may be different from each other, or some
of them equal and others different, etc.


We wrote this PCR6 general formula in the style of PCR5, different from Arnaud Martin &
Christophe Osswald's notations, but actually doing the same thing. In order not to complicate the
formula of PCR6, we did not use more summations or products after the third Sigma.


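As a particular case:

$$
m_{PCR6}(A) = m_{123}(A) +
\sum_{\substack{X_1, X_2 \in G^\Theta \\ X_1 \neq A,\ X_2 \neq A \\ X_1 \cap X_2 \cap A = \emptyset}}
\ \sum_{k=1}^{2}
\ \sum_{(i_1, i_2, i_3) \in P(1,2,3)}
\left[ m_{i_1}(A) + \ldots + m_{i_k}(A) \right] \cdot
\frac{m_{i_1}(A) \ldots m_{i_k}(A)\, m_{i_{k+1}}(X_1) \ldots m_{i_3}(X_{3-k})}
{m_{i_1}(A) + \ldots + m_{i_k}(A) + m_{i_{k+1}}(X_1) + \ldots + m_{i_3}(X_{3-k})}
$$

where P(1,2,3) is the set of permutations of the elements {1,2,3}.
It should also be observed that $X_1$ may be different from or equal to $X_2$.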
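For readers who prefer code to the summation notation, the idea behind PCR6 can be sketched as follows. This is an illustrative sketch, not the authors' implementation: bba's are assumed to be dictionaries from `frozenset` to mass under Shafer's model, and the name `pcr6` is ours. For every conflicting product, each source gets back a share proportional to the mass it committed, so an element A supported by k sources receives the factor $[m_{i_1}(A) + \ldots + m_{i_k}(A)]$ over the sum of all masses involved.

```python
from itertools import product
from math import prod


def pcr6(sources):
    """Sketch of PCR6 for s >= 2 sources under Shafer's model."""
    fused = {}
    for combo in product(*(bba.items() for bba in sources)):
        mass = prod(v for _, v in combo)
        if mass == 0.0:
            continue
        common = frozenset.intersection(*(x for x, _ in combo))
        if common:
            # Non-conflicting product: conjunctive consensus
            fused[common] = fused.get(common, 0.0) + mass
            continue
        # Conflicting product: each source's element receives a share
        # proportional to the single mass that source committed to it
        total = sum(v for _, v in combo)
        for x, v in combo:
            fused[x] = fused.get(x, 0.0) + mass * v / total
    return fused


A, B = frozenset("A"), frozenset("B")
m1 = {A: 0.1, B: 0.7, A | B: 0.2}
m2 = {A: 0.4, B: 0.1, A | B: 0.5}
m122_pcr6 = pcr6([m1, m2, m2])
m12_pcr6 = pcr6([m1, m2])
```

With only two sources the two rules coincide, so `m12_pcr6` is also the two-source PCR5 result; the rules differ only when at least two sources of a conflicting product support the same element.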


Conclusion. 


The aim of this paper was to show how to compute PCR5 manually for 3 sources on some
examples, in order to better understand its essence, and also how to take into consideration
the importance of sources using the Repeated Fusion Method. We did not present the Method of
Discounting to the Empty Set for emphasizing the importance of sources, which is better
than the Repeated Fusion Method, since it was the main topic of paper [2].

We also presented the PCR5 formula for 3 sources (a particular case when n=3), and the general
formula for PCR6 in a different way but yet equivalent to Martin-Osswald's PCR6 formula.
Abstract


This paper proposes a new solution for reducing the number of sources 
of evidence to be combined in order to diminish the complexity of the fusion 
process required in some applications where the real-time constraint and 
strong computing resource limitation are of prime importance. The basic 
idea consists in selecting, among the whole set of sources of evidence, only the 
biggest subset of sources which are not too contradicting based on a criterion 
of Evidence Supporting Measure of Similarity (ESMS) in order to process 
solely the coherent information received. The ESMS criterion serves actually 
as a generic tool for outlier source identification and rejection. Since the 
ESMS between several belief functions can be defined using several distance 
measures, we browse the most common ones in this paper and we describe 
in detail the principle of our Generalized Fusion Machine (GFM). The last 
part of the paper shows the improvement of the performances of this new 
approach with respect to the classical one in a real-data based and real-time 
experiment for robot perception using sonar sensors. 


Key words: 
Information fusion; Belief function; Complexity reduction; Robot 
perception; DSmT; Measure of similarity; Distance; Lattice. 




Advances and Applications of DSmT for Information Fusion. Collected Works. Volume 4 


1. Introduction 


Information fusion (IF) has gained more and more interest in the scientific community
since the end of the nineties because of the development of sophisticated multisensor and
hybrid (involving human feedback in the loop) systems in many fields of application
(robotics, defense, security, medicine, etc.). IF appears in many international scientific
conferences and workshops [11]. The main theories useful for information fusion are
Probability Theory [16, 25] (and more recently Imprecise Probability Theory [38]),
Possibility Theory [8] (based on Fuzzy Sets theory [42]), Neutrosophic Set Theory [15] and
belief function theories, mainly Dempster-Shafer theory (DST) [29] and more recently
Dezert-Smarandache theory (DSmT) [31, 32, 33].


In this work, we concentrate our attention on belief function theories and especially on
DSmT because of its ability to deal efficiently with uncertain, imprecise and conflicting
quantitative and qualitative information. Basically, in DST, a basic belief assignment (bba)
$m(.)$ is a mapping from the power set $2^\Theta$ (see section 2.2 for details) of the frame of
discernment $\Theta$ into [0, 1] such that

$$m(\emptyset) = 0 \quad \text{and} \quad \sum_{X \in 2^\Theta} m(X) = 1.$$

In DST, $\Theta$ represents the set of exclusive and exhaustive possibilities for the solution
of the problem under consideration. In DSmT, $\Theta$ can be a set of possibly non-exclusive
elements and the definition of bba is extended to the lattice structures of the hyper-power
set $D^\Theta$, and to the super-power set $S^\Theta$ in UFT (Unification of Fusion Theories)
[30, 32] (Chap. 8); see also section 2.3 for a brief presentation and [7, 10] and [33] for
definitions, details and examples. In general $m(.)$ is not a probability measure, except
when its focal elements (i.e. the elements which have a strictly positive mass of belief) are
singletons; in such a case, $m(.)$ is called a Bayesian bba [29], which can be considered as a
subjective probability measure. In belief function theories, the main information fusion
problem consists in finding an efficient way of combining several sources of evidence
$s_1, s_2, \ldots, s_n$ characterized by their bba's $m_1(.), m_2(.), \ldots, m_n(.)$, assumed
here for simplicity to be defined on the same fusion space, either $2^\Theta$, $D^\Theta$, or
$S^\Theta$, depending on the underlying model associated with the nature of the frame $\Theta$.
The difficulty in information fusion arises from the fact that the sources can be conflicting
(i.e. one source commits some belief to a proposition A whereas another source commits some
belief to a proposition B, but A and B are known to be truly exclusive ($A \cap B = \emptyset$)),
and one needs a solution for dealing with conflicting information in the fusion process. In
DST, Shafer proposes Dempster's rule of combination as the fusion operator for combining
sources of evidence, whereas in DSmT the recommended fusion operator is the PCR5
(Proportional Conflict Redistribution rule #5) rule of combination; see [29] and [33] for
discussions and comparisons of these rules. PCR5 is more complex than Dempster's rule but it
offers a better ability to deal with conflicting information.


Both rules however become intractable in some applications having only low computational
capacities (for example in some autonomous onboard systems) because their complexity
increases drastically with the number n of sources to combine and/or with the size of the
frame $\Theta$, especially in the worst case (i.e. when a strictly positive mass of belief is
committed to all elements of the fusion space). To circumvent this problem, one has to play
on both sides: 1) reducing the number of sources to combine and 2) reducing the size of the
frame $\Theta$. In this paper, we propose a solution only for reducing the number of sources
to combine, because the second aspect is not a concern in our application of robot
perception, where the frame $\Theta$ has only two elements representing the emptiness or
occupancy states of the grid cells of the perceived map of the environment. To expect good
performances from such a limited-resource fusion scheme, it seems natural to search for and
combine only the sources which are coherent (i.e. not too conflicting) according to a given
measure of similarity.


Such an idea has already been investigated by several authors who have proposed some
distance measures between two evidential sources in different fields of application. For
example, Tessem [35] in 1993 proposed the distance
$d_{ij} = \max_{\theta_l \in \Theta} |BetP_i(\theta_l) - BetP_j(\theta_l)|$ according to the
pignistic probability transform BetP(.). In 1997, Bauer [1] introduced two other measures of
error for taking a decision based on the pignistic probability distribution after
approximation. In 1998, Zouhal and Denoeux [43] also introduced a distance based on the mean
square error between pignistic probabilities. In 1999, Petit-Renaud [26] defined a measure
directly on the power set of $\Theta$ and proposed an error criterion between two belief
structures based on the generalized Hausdorff distance. In 2001, Jousselme et al. [14]
proposed in the DST framework a new distance measure
$d_{ij} = 1 - \frac{1}{\sqrt{2}} \sqrt{\|m_1\|^2 + \|m_2\|^2 - 2\langle m_1, m_2 \rangle}$
between two basic belief assignments (bba's) for measuring their similarity (closeness).
In 2006, Ristic and Smets [27, 28] defined in the TBM (Transferable Belief Model) framework a
TBM-distance between bba's to solve the association of uncertain combat ID declarations.
These authors also recall the Bhattacharya distance
$d_{ij} = \sqrt{1 - \sum_{A \in F_i} \sum_{B \in F_j} \sqrt{m_i(A)\, m_j(B)}}$ between two bba's.
Also in 2006, Diaz et al. [6] proposed a new measure of similarity between bba's based on
Tversky's similarity measure [37]. Note that in belief function theories, classical measures
used in Probability theory (such as the Kullback-Leibler (KL) distance [3]) cannot be applied
directly because bba's are not probability measures in general.


In this paper, we develop an Evidence Support Measure of Similarity 
(ESMS) in a generalized fusion space according to different lattices [7, 10] 
for reducing the number of sources of evidence to combine and thus reducing 
the complexity of the computational burden. As shown in the next sections, 
we propose several possible measures of distance for ESMS and we compare 
their performances in our specific application of mobile robot perception. 
The purpose of this paper is neither to select nor to justify the best measure
of distance for ESMS, but only to show the practical advantage of using the
ESMS criterion as a generic tool for reducing the complexity of the fusion while
keeping good performances for our application.


This paper is organized as follows. In section 2, we briefly recall the 
main paradigms for dealing with uncertain information. In section 3, we 
give a general mathematical definition of ESMS between two basic belief 
assignments and we establish some basic properties of ESMS. In section 4, 
we extend and present different possible ESMS functions (distance measures) 
fitting with the different mathematical paradigms listed in section 2. A 
comparison of the performances of five possible distances is made through 
a simple example in section 5. The simulation presented in section 6 shows
in detail how the ESMS filter is used within the GFM scheme. An application
of ESMS filter in GFM for mobile robot perception with real-data (sonar 
sensors measurements) and in real-time is presented in section 7 to show the 
advantages of the approach proposed here. The conclusion is given in section 
8. 
2. The main paradigms for dealing with uncertainties


2.1. Probability Theory and Bayes' rule

The (axiomatic) Probability Theory [16] is the most accomplished theory for dealing with
randomness. We will not present this theory in detail since there exist dozens of very good
classical books devoted to it, see for example [25]. We just recall that a random experiment
is an experiment (action) whose result is uncertain before it is performed and a trial is a
single performance of the random experiment. An outcome is the result of a trial and the
sample space $\Theta$ is the set of all possible outcomes of the random experiment. An event is
a subset of the sample space $\Theta$ to which a probability measure can be assigned. Two
events $A_i$ and $A_j$ are said to be exclusive (disjoint) if $A_i \cap A_j = \emptyset$,
$\forall i \neq j$, where the empty set $\emptyset$ represents the impossible event. The sure
event is the sample space $\Theta$. Probability theory is based on set theory and measure
theory on sets. The following axioms have been identified as necessary and sufficient for
probability P(.) as a measure: Axiom 1) (nonnegativity) $0 \le P(A) \le 1$, Axiom 2) (unity)
$P(\Theta) = 1$, and Axiom 3) (finite additivity¹) if $A_1, A_2, \ldots, A_n$ are disjoint
events, then $P(A_1 \cup A_2 \cup \ldots \cup A_n) = \sum_{i=1}^{n} P(A_i)$. Events, which are
subsets of the sample space, are put in one-to-one correspondence with propositions in belief
function theory ([29], pages 35-37), which is why we use the terms set, event and proposition
interchangeably in this paper. Probabilistic inference is (usually) carried out by Bayes'
rule according to:

$$P(A_i | B) = \frac{P(A_i \cap B)}{P(B)} = \frac{P(B | A_i) P(A_i)}{\sum_{j=1}^{n} P(B | A_j) P(A_j)} \qquad (1)$$

where the sample space $\Theta$ has been partitioned into exhaustive and exclusive events
$A_1, A_2, \ldots, A_n$, i.e. such that $A_i \cap A_j = \emptyset$ ($i \neq j$) and
$A_1 \cup A_2 \cup \ldots \cup A_n = \Theta$; P(.) is an a priori probability measure defined
on $\Theta$ satisfying Kolmogorov's axioms. In Bayes' formula, it is assumed that the
denominator is strictly positive. A generalization of this rule has been proposed by Jeffrey
[12, 13] for working in circumstances where the parochialist assumption is not reasonable,
i.e. when $P(B|B) = 1$ is a fallacy; see [13, 22] for details and examples.


¹ Another axiom, related to countable additivity, can also be considered as the fourth
axiom of probability theory.
Using the classical terminology adopted in belief function theories (DST
and/or DSmT) and considering for example $\Theta = \{A, B\}$, a discrete probability
measure P(.) can be interpreted as a specific Bayesian belief mass m(.)
such that

$$m(A) + m(B) = 1 \qquad (2)$$


2.2. Dempster-Shafer Theory (DST)

In DST [29], the frame of discernment $\Theta$ of the fusion problem under consideration
consists of a discrete finite set of n exhaustive and exclusive elementary hypotheses
$\theta_i$, i.e. $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$. This is called Shafer's
model of the problem. Such a model assumes that an ultimate refinement of the problem is
possible, exists and is achievable, so that the elements $\theta_i$, $i = 1, 2, \ldots, n$ are
precisely defined and identified in such a way that we are sure that they are truly exclusive
and exhaustive (closed-world assumption). The set of all subsets of $\Theta$ is called the
power set of $\Theta$ and is denoted $2^\Theta$. Its cardinality is $2^{|\Theta|}$. Since
$2^\Theta$ is closed under $\cup$ and all $\theta_i$, $i = 1, 2, \ldots, n$ are exclusive, it
defines a Boolean algebra. The power set consists of all composite propositions built from
elements of $\Theta$ with the $\cup$ operator such that:


1) $\emptyset, \theta_1, \ldots, \theta_n \in 2^\Theta$;

2) If $A, B \in 2^\Theta$, then $A \cup B \in 2^\Theta$;

3) No other elements belong to $2^\Theta$, except those obtained by using rules
1) or 2).


Shafer defines a basic belief assignment (bba), also called mass function, as a mapping
$m(.) : 2^\Theta \to [0,1]$ satisfying $m(\emptyset) = 0$ and the normalization condition.
Typically, when $\Theta = \{A, B\}$ and Shafer's model holds, in DST one works with m(.) such
that

$$m(A) + m(B) + m(A \cup B) = 1 \qquad (3)$$

$m(A \cup B)$ allows us to commit some belief to the disjunction $A \cup B$, which represents
the ignorance in choosing between A and B. From this very simple example, one sees clearly
the ability of DST to offer a better modeling of a totally ignorant (vacuous) source of
information by setting $m(A \cup B) = 1$, whereas in Probability Theory one would be forced to
adopt the principle of insufficient reason (also known as the principle of indifference) to
justify taking $m(A) = m(B) = 1/2$ as the default belief mass for representing a totally
ignorant body of evidence.
In the DST framework, the combination of two belief assignments $m_1(.)$ and $m_2(.)$ is
done using Dempster's rule of combination, which can be seen as the normalized version of
the conjunctive rule, in order to remove the total conflicting mass and to get a properly
normalized belief mass after the combination [29]. Dempster's rule is mathematically defined
by $m(\emptyset) = 0$ and for $X \neq \emptyset$ by

$$m(X) = \frac{\sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2)}{1 - \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = \emptyset}} m_1(X_1)\, m_2(X_2)} \qquad (4)$$

Dempster's formula is defined if and only if the two sources of evidence are not fully
conflicting, that is when
$\sum_{X_1, X_2 \in 2^\Theta,\ X_1 \cap X_2 = \emptyset} m_1(X_1)\, m_2(X_2) \neq 1$.
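A minimal Python sketch of formula (4) is given below. The dictionary-of-`frozenset` representation, the function name `dempster`, the numerical tolerance and the sample masses are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def dempster(m1, m2):
    """Sketch of Dempster's rule (4) for two bba's on the power set."""
    joint = {}
    conflict = 0.0
    for (x1, v1), (x2, v2) in product(m1.items(), m2.items()):
        inter = x1 & x2
        if inter:
            # Numerator of (4): products whose intersection is X
            joint[inter] = joint.get(inter, 0.0) + v1 * v2
        else:
            # Total conflicting mass, removed by the normalization
            conflict += v1 * v2
    if abs(1.0 - conflict) < 1e-12:
        raise ValueError("fully conflicting sources: rule (4) is undefined")
    return {x: v / (1.0 - conflict) for x, v in joint.items()}


# Hypothetical usage on the frame {A, B}
A, B = frozenset("A"), frozenset("B")
m1 = {A: 0.1, B: 0.7, A | B: 0.2}
m2 = {A: 0.4, B: 0.1, A | B: 0.5}
fused = dempster(m1, m2)
```

For these sample masses the total conflict is 0.29, so every conjunctive mass is divided by 0.71.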


2.3. Dezert-Smarandache Theory (DSmT)


In the DSmT framework [31, 32, 33], the frame
$\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ is a finite set of n exhaustive elements
which are not necessarily exclusive. The principle of the third excluded middle and Shafer's
model are refuted in DSmT (but can be introduced if needed depending on the model of the
frame one wants to deal with), since for a wide class of fusion problems, the hypotheses can
only be vague and imprecise or crude approximations of reality, and no ultimate refinement
is achievable. As a simple example, if we consider two suspects Peter (P) and Mary (M) in
some criminal investigation, it may be possible that Peter has committed the crime alone, as
well as Mary, or maybe Peter and Mary have committed the crime together. In that case, one
has to consider the possibility that $P \cap M \neq \emptyset$, but there is no way to refine
the original frame $\Theta = \{P, M\}$ into a finer one with exclusive finer elements, say
$\Theta' = \{P \setminus (P \cap M),\ P \cap M,\ M \setminus (P \cap M)\}$, because there is no
physical meaning and no possible occurrence of the atomic granules $P \setminus (P \cap M)$
and $M \setminus (P \cap M)$. In other words, the finer exclusive elements of the refined
frame satisfying Shafer's model cannot always be well identified and precisely separated, and
they may make no sense at all. This is the main reason why DSmT allows, as its foundation,
the possibility to deal with non-exclusive, partially overlapping or vague elements and
refutes Shafer's model and the third excluded middle assumptions. DSmT proposes to work on a
fusion space defined by Dedekind's lattice, also called hyper-power set $D^\Theta$ in DSmT.


The hyper-power set is defined as the set of all composite propositions
built from elements of $\Theta$ with the $\cap$ and $\cup$ operators such that [5]:


1) $\emptyset, \theta_1, \ldots, \theta_n \in D^\Theta$;
2) If $A, B \in D^\Theta$, then $A \cup B \in D^\Theta$ and $A \cap B \in D^\Theta$;


3) No other elements belong to $D^\Theta$, except those obtained by using rules
1) or 2).
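The three rules above describe a closure, which can be sketched in Python for small frames. The encoding is an assumption made for this illustration: each $\theta_i$ is represented by the set of Venn-diagram regions it covers (a region being a non-empty subset of indices), so that $\cup$ and $\cap$ become ordinary set operations; the function name `hyper_power_set` is ours.

```python
from itertools import combinations


def hyper_power_set(n):
    """Sketch: build D^Theta for the free DSm model with n hypotheses
    by closing {empty set, theta_1, ..., theta_n} under union and
    intersection (rules 1 to 3)."""
    indices = range(1, n + 1)
    # One Venn region per non-empty subset of hypothesis indices
    regions = [frozenset(c) for k in indices for c in combinations(indices, k)]
    # theta_i covers every region whose index set contains i
    thetas = [frozenset(r for r in regions if i in r) for i in indices]
    elements = {frozenset()} | set(thetas)       # rule 1
    changed = True
    while changed:                               # rule 2, until stable
        changed = False
        for a in list(elements):
            for b in list(elements):
                for c in (a | b, a & b):
                    if c not in elements:
                        elements.add(c)
                        changed = True
    return elements


sizes = {n: len(hyper_power_set(n)) for n in (1, 2, 3)}
```

The number of elements grows very quickly with n, which is one reason why limiting the size of the frame matters for the complexity of the fusion.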


Following Shafer's idea, Dezert and Smarandache define a (generalized) basic
belief assignment (or mass) as a mapping $m(.) : D^\Theta \to [0,1]$ such that:

$$m(\emptyset) = 0 \quad \text{and} \quad \sum_{X \in D^\Theta} m(X) = 1.$$

Typically, when $\Theta = \{A, B\}$ and Shafer's model doesn't hold, in DSmT
one works with m(.) such that

$$m(A) + m(B) + m(A \cup B) + m(A \cap B) = 1 \qquad (5)$$

which appears actually as a direct and natural mathematical extension of (2)
and (3).


Actually DSmT also offers the advantage of working with Shafer's model or with any hybrid
model if some integrity constraints between elements of the frame are known to be true and
must be taken into account in the fusion process. DSmT allows solving static and/or dynamic²
fusion problems in the same general mathematical framework. For notational convenience, one
denotes by $G^\Theta$ the generalized fusion space or generalized power set including
integrity constraints (i.e. exclusivity as well as possible non-existence restrictions
between some elements of $\Theta$), so that $G^\Theta = D^\Theta$ when no constraint enters in
the model, or $G^\Theta = 2^\Theta$ when one wants to work with Shafer's model (see [31] for
details and examples), or $G^\Theta = \Theta$ when working with the probability model. If one
wants to work with the space closed under union $\cup$, intersection $\cap$, and
complementarity $c(.)$ operators, then $G^\Theta = S^\Theta$, i.e. the super-power set (see
next section). A more general introduction of DSmT can be found in Chapter 1 of [33].


² i.e. when the frame and/or its model change with time.


In DSmT, the fusion of two sources of evidence characterized by $m_1(.)$ and $m_2(.)$ is
defined by $m_{PCR5}(\emptyset) = 0$ and $\forall X \in G^\Theta \setminus \{\emptyset\}$

$$m_{PCR5}(X) = m_{12}(X) + \sum_{\substack{Y \in G^\Theta \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2\, m_1(Y)}{m_2(X) + m_1(Y)} \right] \qquad (6)$$

where all sets involved in the formulas are in canonical form;
$m_{12}(X) = m_\cap(X) = \sum_{X_1, X_2 \in G^\Theta,\ X_1 \cap X_2 = X} m_1(X_1)\, m_2(X_2)$
corresponds to the conjunctive consensus on X between the n = 2 sources, and where all
denominators are different from zero. If a denominator is zero, that fraction is discarded.
A general formula of PCR5 for the fusion of n > 2 sources has been proposed in [32].


2.4. Unification of Fusion Theory (UFT)

Recently Smarandache has proposed in [30, 32] an extension of DSmT by considering a
super-power set $S^\Theta$ as the Boolean algebra on $\Theta$, i.e.
$S^\Theta = (\Theta, \cap, \cup, c(.))$. In other words, $S^\Theta$ is assumed to be closed
under union $\cup$, intersection $\cap$, and complement $c(.)$ of sets respectively. With
respect to the partial ordering relation, the inclusion $\subseteq$, the minimum element is
the empty set $\emptyset$, and the maximal element is the total ignorance
$I = \bigcup_{i=1}^{n} \theta_i$. Since it extends the power set space through the closure
under the $\cap$, $\cup$ and $c(.)$ operators, UFT not only considers the non-exclusive
situation among the elements, but also considers the exclusive, exhaustive and
non-exhaustive situations, and even the open and closed world. Typically, when
$\Theta = \{A, B\}$, in UFT one works with m(.) such that

$$m(A) + m(B) + m(A \cap B) + m(A \cup B) + m(c(A)) + m(c(B)) + m(c(A) \cup c(B)) = 1 \qquad (7)$$


3. Evidence Support Measure of Similarity (ESMS)


Definition 3.1. Let's consider a discrete and finite frame $\Theta$ and the fusion space
$G^\Theta$ including integrity constraints of the model associated with $\Theta$. The infinite
set of basic belief assignments defined on $G^\Theta$ is denoted by $m_{G^\Theta}$. An Evidence
Support Measure of Similarity (ESMS) of two (generalized) basic belief assignments $m_1(.)$
and $m_2(.)$ in $m_{G^\Theta}$ is the function
$Sim(.,.) : m_{G^\Theta} \times m_{G^\Theta} \to [0,1]$ satisfying the following conditions:

1) Symmetry: $\forall m_1(.), m_2(.) \in m_{G^\Theta}$, $Sim(m_1, m_2) = Sim(m_2, m_1)$;
2) Consistency: $\forall m(.) \in m_{G^\Theta}$, $Sim(m, m) = 1$;
3) Non-negativity: $\forall m_1(.), m_2(.) \in m_{G^\Theta}$, $Sim(m_1, m_2) \ge 0$.


We will say that $m_2(.)$ is more similar to $m_1(.)$ than $m_3(.)$ if and only if
$Sim(m_1, m_2) > Sim(m_1, m_3)$. The maximum degree of similarity is naturally obtained when
both bba's $m_1(.)$ and $m_2(.)$ coincide, which is expressed by the consistency condition 2).
The equality $Sim(m_1, m_2) = 0$ must be obtained when the bba's have no focal elements in
common, in particular whenever $m_1(.)$ is focused on $X \in G^\Theta$, which is denoted
$m_1^X(.)$ and corresponds to $m_1(X) = 1$, and $m_2(.)$ is focused on $Y \in G^\Theta$, i.e.
$m_2(.) = m_2^Y(.)$ such that $m_2(Y) = 1$, with $X \cap Y = \emptyset$.


Theorem 3.1. For any bba $m_1(.) \in m_{G^\Theta}$ (which is a $|G^\Theta|$-dimensional
vector) and any small positive real number $\epsilon$, there exists at least one bba
$m_2(.) \in m_{G^\Theta}$ for a given distance measure³ $d(.,.)$ such that
$d(m_1, m_2) \le \epsilon$.


Proof: Let's take $m_2(.) = m_1(.)$, then $d(m_1, m_2) = d(m_1, m_1) = d(m_2, m_2) = 0 < \epsilon$,
which completes the proof.


Definition 3.2. (Agreement of evidence): If there exist two basic belief assignments
$m_1(.)$ and $m_2(.)$ in $m_{G^\Theta}$ such that for some distance measure $d(.,.)$, one has
$d(m_1, m_2) \le \epsilon$ with $\epsilon > 0$, then $\epsilon$ is called the agreement of
evidence supporting measure between $m_1(.)$ and $m_2(.)$ with respect to the chosen distance
$d(.,.)$. $m_1(.)$ and $m_2(.)$ are said to be $\epsilon$-consistent with respect to the
distance $d(.,.)$.


Theorem 3.2. The smaller $\epsilon > 0$ is, the smaller the distance $d(m_1, m_2)$ between
$m_1(.)$ and $m_2(.)$ is, that is, the more similar or consistent $m_1(.)$ and $m_2(.)$ are.


Proof: According to Definition 3.2, if $m_1(.)$ and $m_2(.)$ are $\epsilon$-consistent, then
$d(m_1, m_2) \le \epsilon$. Let's take $\epsilon = 1 - Sim(m_1, m_2)$; when $\epsilon$ becomes
smaller and smaller, $Sim(m_1, m_2)$ becomes greater and greater according to the definition
of ESMS, and thus $m_1(.)$ and $m_2(.)$ become more similar or consistent. Finally, if
$\epsilon = 1 - Sim(m_1, m_2) = 0$, then $m_1(.)$ and $m_2(.)$ are totally consistent.


³ Here we don't specify the distance measure and keep it only as a generic distance.
Actually $d(.,.)$ can be any distance measure. In practice, the Euclidean distance is
frequently used.


From the previous definitions and theorems, the ESMS appears as an interesting measure for
evaluating the degree of similarity between two sources. We propose to use ESMS in a
pre-processing/thresholding technique in order to reduce the complexity of the combination
of sources of evidence by keeping in the fusion process only the sources which are
$\epsilon$-consistent. $\epsilon$ is actually a threshold parameter which has to be tuned by
the system designer and which depends on the application and computational resources.


4. Several possible ESMS 


In this section we propose several possibilities for choosing an ESMS
function $Sim(.,.)$ satisfying theorem 3.1.


4.1. Euclidean ESMS function $Sim_E(m_1, m_2)$

Definition 4.1. Let $\Theta = \{\theta_1, \ldots, \theta_n\}$ (n > 1), $m_1(.)$ and $m_2(.)$ in
$m_{G^\Theta}$, $X_i$ the i-th (generic) element of $G^\Theta$ and $|G^\Theta|$ the cardinality
of $G^\Theta$. The following simple Euclidean ESMS function can be extended from [14]:

$$Sim_E(m_1, m_2) = 1 - \frac{1}{\sqrt{2}} \sqrt{\sum_{i=1}^{|G^\Theta|} \left( m_1(X_i) - m_2(X_i) \right)^2} \qquad (8)$$

The following theorem establishes that $Sim_E(m_1, m_2)$ is an ESMS function.


Theorem 4.1. $Sim_E(m_1, m_2)$ defined in (8) is an ESMS function.
Proof:

1) Let's prove that $Sim_E(m_1, m_2) \in [0,1]$. If $Sim_E(m_1, m_2) > 1$, from (8) one would
get $\sqrt{\sum_{i=1}^{|G^\Theta|} (m_1(X_i) - m_2(X_i))^2} < 0$, which is impossible, so that
$Sim_E(m_1, m_2) \le 1$. Let's prove $Sim_E(m_1, m_2) \ge 0$ or equivalently from (8),
$\sum_{i=1}^{|G^\Theta|} (m_1(X_i) - m_2(X_i))^2 \le 2$. This inequality is equivalent to
$\sum_{i=1}^{|G^\Theta|} m_1(X_i)^2 + \sum_{i=1}^{|G^\Theta|} m_2(X_i)^2 \le 2 + 2 \sum_{i=1}^{|G^\Theta|} m_1(X_i)\, m_2(X_i)$.
We denote it (i) for short. (i) always holds because one has

$$\sum_{i=1}^{|G^\Theta|} m_1(X_i)^2 + \sum_{i=1}^{|G^\Theta|} m_2(X_i)^2 \le \left[ \sum_{i=1}^{|G^\Theta|} m_1(X_i) \right]^2 + \left[ \sum_{i=1}^{|G^\Theta|} m_2(X_i) \right]^2$$

and thus
$\sum_{i=1}^{|G^\Theta|} m_1(X_i)^2 + \sum_{i=1}^{|G^\Theta|} m_2(X_i)^2 \le 2$ because
$\left[ \sum_{i=1}^{|G^\Theta|} m_s(X_i) \right]^2 = 1$ for s = 1, 2 ($m_s(.)$ being a
normalized bba). Therefore inequality (i) holds and thus $Sim_E(m_1, m_2) \ge 0$.

2) It is easy to check that $Sim_E(m_1, m_2)$ satisfies the first condition of
Definition 3.1.

3) If $m_1(.) = m_2(.)$, then $Sim_E(m_1, m_2) = 1$ because

$$\sum_{i=1}^{|G^\Theta|} \left( m_1(X_i) - m_2(X_i) \right)^2 = 0.$$

Thus the second condition of Definition 3.1 is also satisfied.
4) Non-negativity has been proven above in the first part. Herein we use a particular case
to show that $Sim_E(m_1, m_2) = 0$ can be reached, i.e. there exist $m_1^X$ and $m_2^Y$ for some
$X, Y \in G^\Theta \setminus \{\emptyset\}$ such that $X \neq Y$; then according to (8), one
gets $\sum_{i=1}^{|G^\Theta|} (m_1^X(X_i) - m_2^Y(X_i))^2 = [m_1^X(X)]^2 + [m_2^Y(Y)]^2 = 2$ and
thus one has $Sim_E(m_1^X, m_2^Y) = 1 - (\sqrt{2} / \sqrt{2}) = 0$, so that $Sim_E(.,.)$
verifies the third condition of Definition 3.1.
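Formula (8) and the properties used in the proof can be checked with a short Python sketch. The dictionary representation of a bba (keys are any hashable labels of the elements $X_i$, missing keys meaning zero mass), the function name `sim_e` and the sample masses are assumptions made for illustration.

```python
from math import sqrt


def sim_e(m1, m2):
    """Sketch of the Euclidean ESMS function (8)."""
    keys = set(m1) | set(m2)
    squared = sum((m1.get(x, 0.0) - m2.get(x, 0.0)) ** 2 for x in keys)
    return 1.0 - sqrt(squared) / sqrt(2.0)


# Hypothetical bba's on the frame {A, B}
A, B = frozenset("A"), frozenset("B")
m1 = {A: 0.1, B: 0.7, A | B: 0.2}
m2 = {A: 0.4, B: 0.1, A | B: 0.5}

same = sim_e(m1, m1)                  # consistency: equals 1
opposite = sim_e({A: 1.0}, {B: 1.0})  # bba's focused on X != Y: equals 0
between = sim_e(m1, m2)               # strictly between 0 and 1 here
```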


4.2. Jousselme ESMS function $Sim_J(m_1, m_2)$


Definition 4.2. Let $m_1(.)$ and $m_2(.)$ be two basic belief assignments in $m_{G^\Theta}$
provided by the sources of evidence $S_1$ and $S_2$. Let $D = [D_{ij}]$ be a
$|G^\Theta| \times |G^\Theta|$ matrix, assumed⁴ positive definite, where
$D_{ij} = |X_i \cap X_j| / |X_i \cup X_j|$, with $X_i, X_j \in G^\Theta$. Then, the Jousselme
ESMS function can be redefined from the Jousselme et al. measure [14]:

$$Sim_J(m_1, m_2) = 1 - \frac{1}{\sqrt{2}} \sqrt{(m_1 - m_2)^T D\, (m_1 - m_2)} \qquad (9)$$

or equivalently

$$Sim_J(m_1, m_2) = 1 - \frac{1}{\sqrt{2}} \sqrt{\|m_1\|^2 + \|m_2\|^2 - 2\langle m_1, m_2 \rangle}$$


⁴ Actually, Jousselme et al. in [14] did not prove that
$D = [D_{ij} = |X_i \cap X_j| / |X_i \cup X_j|]$ is truly a positive definite matrix. D is
until now assumed to be positive definite. This is only a conjecture and proving it is not a
trivial problem.
where $\langle m_1, m_2 \rangle$ is the scalar product defined as

$$\langle m_1, m_2 \rangle = \sum_{i=1}^{|G^\Theta|} \sum_{j=1}^{|G^\Theta|} D_{ij}\, m_1(X_i)\, m_2(X_j)$$

with $X_i, X_j \in G^\Theta$, $i, j = 1, \ldots, |G^\Theta|$; $\|m\|^2$ represents the squared
norm of the vector (bba) m, i.e. $\|m\|^2 = \langle m, m \rangle$.
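The quadratic form in (9) can be sketched in Python as follows. The sketch assumes Shafer's model, so that elements are non-empty `frozenset`s and $|X|$ is the ordinary set cardinality (on $D^\Theta$ a DSm cardinality would be needed instead); the names `jaccard` and `sim_j` are ours.

```python
from math import sqrt


def jaccard(x, y):
    """D_ij = |X_i ∩ X_j| / |X_i ∪ X_j| for non-empty sets."""
    return len(x & y) / len(x | y)


def sim_j(m1, m2):
    """Sketch of the Jousselme ESMS function (9)."""
    keys = list(set(m1) | set(m2))
    diff = [m1.get(x, 0.0) - m2.get(x, 0.0) for x in keys]
    # (m1 - m2)^T D (m1 - m2), written as a double sum
    quad = sum(
        di * dj * jaccard(xi, xj)
        for xi, di in zip(keys, diff)
        for xj, dj in zip(keys, diff)
    )
    # Guard against tiny negative values caused by rounding
    return 1.0 - sqrt(max(quad, 0.0)) / sqrt(2.0)


A, B = frozenset("A"), frozenset("B")
partial = sim_j({A: 1.0}, {A | B: 1.0})   # overlapping focal elements
disjoint = sim_j({A: 1.0}, {B: 1.0})      # no overlap at all
```

Unlike the Euclidean version, two bba's focused on different but overlapping elements obtain a strictly positive similarity, because the off-diagonal terms of D account for the overlap.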


Theorem 4.2. $Sim_J(m_1, m_2)$ defined in formula (9) is an ESMS function.


Proof:


1) Since the matrix D is conjectured to be a positive definite matrix,
$Sim_J(m_1, m_2)$ satisfies the condition of symmetry.

2) If $m_1$ is equal to $m_2$, according to (9), one gets $Sim_J(m_1, m_2) = 1$. On the
other hand, if $Sim_J(m_1, m_2) = 1$, then the condition $m_1 = m_2$ holds.
That is, the condition of consistency is satisfied.

3) According to (9), it can be shown that $Sim_J(m_1, m_2) \ge Sim_E(m_1, m_2)$,
and since the minimum value of $Sim_E(m_1, m_2)$ is zero, $Sim_J(m_1, m_2)$
is non-negative.

4) According to the definition of Sim (m1, mz), we can easily verify that 
Sim (m1, Mo) is a true distance measure between m; and mo. 


Actually, $Sim_E(m_1, m_2)$ is nothing but a special case of $Sim_J(m_1, m_2)$ when taking D as the $|G^\Theta| \times |G^\Theta|$ identity matrix.
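As a minimal sketch of how formulas (8) and (9) relate, the following Python code evaluates both measures through one quadratic form. It assumes Shafer's model on a two-element frame, so that $|X|$ is the ordinary set cardinality; the names `jaccard_matrix`, `sim_quadratic`, `ELEMENTS` and the three test bbas are illustrative assumptions, not part of the original work.

```python
import math


def jaccard_matrix(elements):
    """D[i][j] = |Xi & Xj| / |Xi | Xj| for non-empty frozensets (Jousselme's matrix)."""
    return [[len(a & b) / len(a | b) for b in elements] for a in elements]


def sim_quadratic(m1, m2, weights):
    """1 - sqrt(0.5 * (m1 - m2)^T W (m1 - m2)) for bbas given as aligned lists."""
    diff = [a - b for a, b in zip(m1, m2)]
    size = len(diff)
    quad = sum(weights[i][j] * diff[i] * diff[j]
               for i in range(size) for j in range(size))
    return 1.0 - math.sqrt(max(quad, 0.0) / 2.0)


def sim_euclidean(m1, m2):
    """Formula (8): the quadratic form with the identity matrix."""
    size = len(m1)
    identity = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    return sim_quadratic(m1, m2, identity)


def sim_jousselme(m1, m2, elements):
    """Formula (9): the quadratic form with Jousselme's matrix D."""
    return sim_quadratic(m1, m2, jaccard_matrix(elements))


# Assumed toy frame (Shafer's model): focal elements {t1}, {t2}, {t1, t2}.
ELEMENTS = [frozenset({"t1"}), frozenset({"t2"}), frozenset({"t1", "t2"})]
m_a = [1.0, 0.0, 0.0]   # all mass on {t1}
m_b = [0.0, 1.0, 0.0]   # all mass on {t2}
m_c = [0.0, 0.0, 1.0]   # total ignorance

print(sim_euclidean(m_a, m_c), sim_jousselme(m_a, m_c, ELEMENTS))
```

With these inputs the Euclidean measure treats $\{t_1\}$ and $\{t_1, t_2\}$ as completely dissimilar, whereas the Jousselme measure gives them partial credit because the two sets overlap.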


4.3. Ordered ESMS function $Sim_O(m_1, m_2)$


The definition of this (partial) ordered-based ESMS function is similar to $Sim_J(m_1, m_2)$, but instead of using Jousselme's matrix $D = [D_{ij}]$, where $D_{ij} = |X_i \cap X_j|/|X_i \cup X_j|$, with $X_i, X_j \in G^\Theta$, we choose the DSm matrix $S = [S_{ij}]$, where $S_{ij} = s(X_i \cap X_j)/s(X_i \cup X_j)$. Therefore, one has

$$Sim_O(m_1, m_2) = 1 - \frac{1}{\sqrt{2}} \sqrt{(m_1 - m_2)^T S\, (m_1 - m_2)} \qquad (10)$$

The function $s(X)$ corresponds to the intrinsic informational content of the proposition X, defined in detail in [31] (Chap. 3), which is used for partially ordering the elements of $G^\Theta$. More precisely, $s(X)$ is the sum of the inverses of the lengths of the components of Smarandache's code⁵ of X.

As a simple example, let's take $\Theta = \{\theta_1, \theta_2\}$ with the free DSm model (i.e. when all elements are non-exclusive two by two). Then the partially⁶ ordered hyper-power set $G^\Theta$ is given by $G^\Theta = \{\emptyset, \theta_1 \cap \theta_2, \theta_1, \theta_2, \theta_1 \cup \theta_2\}$ because $s(\emptyset) = 0$, $s(\theta_1 \cap \theta_2) = 1/2$, $s(\theta_1) = 1 + 1/2$, $s(\theta_2) = 1 + 1/2$ and $s(\theta_1 \cup \theta_2) = 1 + 1 + 1/2$, since Smarandache's codes of $\emptyset$, $\theta_1$, $\theta_2$, $\theta_1 \cap \theta_2$ and $\theta_1 \cup \theta_2$ are respectively given by {< . >} (empty code), {< 1 >, < 12 >}, {< 2 >, < 12 >}, {< 12 >} and {< 1 >, < 12 >, < 2 >}. The matrix S is defined by⁷


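$$S = \begin{bmatrix}
\frac{s(\theta_1 \cap \theta_2)}{s(\theta_1 \cap \theta_2)} & \frac{s(\theta_1 \cap \theta_2)}{s(\theta_1)} & \frac{s(\theta_1 \cap \theta_2)}{s(\theta_2)} & \frac{s(\theta_1 \cap \theta_2)}{s(\theta_1 \cup \theta_2)} \\
\frac{s(\theta_1 \cap \theta_2)}{s(\theta_1)} & \frac{s(\theta_1)}{s(\theta_1)} & \frac{s(\theta_1 \cap \theta_2)}{s(\theta_1 \cup \theta_2)} & \frac{s(\theta_1)}{s(\theta_1 \cup \theta_2)} \\
\frac{s(\theta_1 \cap \theta_2)}{s(\theta_2)} & \frac{s(\theta_1 \cap \theta_2)}{s(\theta_1 \cup \theta_2)} & \frac{s(\theta_2)}{s(\theta_2)} & \frac{s(\theta_2)}{s(\theta_1 \cup \theta_2)} \\
\frac{s(\theta_1 \cap \theta_2)}{s(\theta_1 \cup \theta_2)} & \frac{s(\theta_1)}{s(\theta_1 \cup \theta_2)} & \frac{s(\theta_2)}{s(\theta_1 \cup \theta_2)} & \frac{s(\theta_1 \cup \theta_2)}{s(\theta_1 \cup \theta_2)}
\end{bmatrix}
= \begin{bmatrix}
1 & 1/3 & 1/3 & 1/5 \\
1/3 & 1 & 1/5 & 3/5 \\
1/3 & 1/5 & 1 & 3/5 \\
1/5 & 3/5 & 3/5 & 1
\end{bmatrix}$$

where rows and columns are ordered as $\theta_1 \cap \theta_2$, $\theta_1$, $\theta_2$, $\theta_1 \cup \theta_2$.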


It is easy to verify the positive definiteness of the matrix S by checking the positivity of all its eigenvalues, which are $\lambda_1 = 0.800 > 0$, $\lambda_2 \approx 0.835 > 0$, $\lambda_3 \approx 0.205 > 0$ and $\lambda_4 \approx 2.160 > 0$. We have verified the positive definiteness of the matrix S for $Card(\Theta) = n < 5$. Since a general proof of the positive definiteness of D and S seems difficult to obtain, we can presently only conjecture the positive definiteness of S.
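The construction of S for the two-element example can be reproduced with a short Python sketch. It encodes each element of $G^\Theta \setminus \{\emptyset\}$ by its Smarandache code (as a set of strings, an encoding chosen here for illustration), builds S with exact fractions, and checks positive definiteness with a hand-written Cholesky test instead of an eigenvalue solver. The names `CODES`, `s`, `dsm_matrix` and `is_positive_definite` are assumptions of this sketch.

```python
from fractions import Fraction

# Smarandache codes under the free DSm model for Theta = {theta1, theta2}:
# each element is represented by the set of Venn-diagram parts it covers.
CODES = {
    "t1&t2": frozenset({"12"}),
    "t1": frozenset({"1", "12"}),
    "t2": frozenset({"2", "12"}),
    "t1|t2": frozenset({"1", "12", "2"}),
}


def s(code):
    """Sum of the inverses of the lengths of the code components."""
    return sum(Fraction(1, len(part)) for part in code)


def dsm_matrix(codes):
    """S[i][j] = s(Xi & Xj) / s(Xi | Xj), in the insertion order of `codes`."""
    names = list(codes)
    return [[s(codes[a] & codes[b]) / s(codes[a] | codes[b]) for b in names]
            for a in names]


def is_positive_definite(matrix):
    """Cholesky test for a symmetric matrix: every pivot must be strictly positive."""
    size = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    lower = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            acc = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:
                    return False
                lower[i][j] = acc ** 0.5
            else:
                lower[i][j] = acc / lower[j][j]
    return True


S = dsm_matrix(CODES)
for row in S:
    print([str(x) for x in row])
print(is_positive_definite(S))
```

The eigenvalue $\lambda_1 = 0.800$ quoted above can also be checked by hand: the vector $(0, 1, -1, 0)$ is mapped by S to $(0, 4/5, -4/5, 0)$.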


4.4. ESMS function $Sim_B(m_1, m_2)$
Another ESMS function, based on Bhattacharyya's distance, is defined as follows:


Definition 4.3. Let $m_1(\cdot)$, $m_2(\cdot)$ be two basic belief assignments in $m_{G^\Theta}$. The ESMS function $Sim_B(m_1, m_2)$ is defined by:

$$Sim_B(m_1, m_2) = 1 - \sqrt{1 - \sum_{X_i \in F} \sqrt{m_1(X_i)\, m_2(X_i)}} \qquad (11)$$

where F is the core of sources $S_1$ and $S_2$, i.e. the set of elements of $G^\Theta$ having a positive belief mass: $F = \{X \in G^\Theta \mid m_1(X) > 0 \text{ or } m_2(X) > 0\}$.


⁵ Smarandache's code is a representation of the disjoint parts of the Venn diagram of the frame $\Theta$ under consideration. This code depends on the model for $\Theta$. For example, let's take $\Theta = \{\theta_1, \theta_2\}$. If $\theta_1 \cap \theta_2 = \emptyset$ (Shafer's model) is assumed, then the code of $\theta_1$ is < 1 >, whereas if $\theta_1 \cap \theta_2 \neq \emptyset$ (free DSm model) is assumed, then the code of $\theta_1$ will be {< 1 >, < 12 >}. The length of a component of a code is the number of characters between < and > in Smarandache's notation. For example, the length of component < 12 > is 2. See [31], pp. 42-43 for details.

⁶ This is a partial order since $s(\theta_1) = s(\theta_2)$.

⁷ Actually, one works with $G^\Theta \setminus \{\emptyset\}$, and thus the column and row corresponding to the empty set do not enter in the definition of S.


Theorem 4.3. $Sim_B(m_1, m_2)$ defined in formula (11) is an ESMS function.


Proof:


1) Since $\sum_{X_i \in F} \sqrt{m_1(X_i)\, m_2(X_i)} = \sum_{X_i \in F} \sqrt{m_2(X_i)\, m_1(X_i)}$, $Sim_B(m_1, m_2)$ satisfies the condition of symmetry.

2) If $m_1(\cdot) = m_2(\cdot)$, then according to (11),

$$\sum_{X_i \in F} \sqrt{m_1(X_i)\, m_2(X_i)} = \sum_{X_i \in F} m_1(X_i) = 1$$

and therefore $Sim_B(m_1, m_1) = 1$. On the other hand, if $Sim_B(m_1, m_2) = 1$, then the condition $m_1(\cdot) = m_2(\cdot)$ holds. That is, the condition of consistency is satisfied.

3) From the definition of a bba, $\sum_{X_i \in F} m_1(X_i) = 1$. Therefore,

$$\sum_{X_i \in F} \sqrt{m_1(X_i)\, m_2(X_i)} \in [0, 1].$$

According to (11), it can be drawn that $Sim_B(m_1, m_2) \in [0, 1]$; that is, the minimum value of $Sim_B(m_1, m_2)$ is zero. Therefore, $Sim_B(m_1, m_2)$ is a non-negative measure.

4) According to the definition of $Sim_B(m_1, m_2)$, we can easily verify that $Sim_B(m_1, m_2)$ is a true distance measure between $m_1$ and $m_2$.
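Formula (11) is simple to evaluate once the bbas are stored as mappings from focal elements to masses. The Python sketch below is one possible reading of it; the dictionary representation, the labels `"a"`, `"b"` and the function name `sim_bhattacharyya` are assumptions made for illustration.

```python
import math


def sim_bhattacharyya(m1, m2):
    """Formula (11): 1 - sqrt(1 - sum over the core F of sqrt(m1(X) * m2(X)))."""
    # Core F: elements carrying a positive mass in at least one of the two bbas.
    core = {x for x in set(m1) | set(m2)
            if m1.get(x, 0.0) > 0 or m2.get(x, 0.0) > 0}
    coeff = sum(math.sqrt(m1.get(x, 0.0) * m2.get(x, 0.0)) for x in core)
    # max(...) guards against a tiny negative value caused by rounding.
    return 1.0 - math.sqrt(max(0.0, 1.0 - coeff))


same = {"a": 0.25, "b": 0.75}
only_a = {"a": 1.0}
only_b = {"b": 1.0}

print(sim_bhattacharyya(same, same))      # identical bbas
print(sim_bhattacharyya(only_a, only_b))  # bbas with disjoint cores
print(sim_bhattacharyya(same, only_a))
```

The first two calls illustrate the two extreme cases used in the proof: identical bbas reach the maximum similarity, and bbas with disjoint cores reach the minimum.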


5. Comparison of ESMS functions 


In this section we analyze the performances of the four ESMS functions aforementioned through a very simple example, where $\Theta = \{\theta_1, \theta_2, \theta_3\}$. We assume the free DSm model for $\Theta$. In such a case, $G^\Theta = D^\Theta$ has eighteen non-empty elements $a_i$, $i = 1, 2, \ldots, 18$. $G^\Theta$ is closed under the $\cap$ and $\cup$ operators according to Dedekind's lattice. Of course, we may also choose other theoretical frameworks in a similar way, according to the appropriate model of $G^\Theta$.


Let's assume that $\theta_2$ is the true identity of the object under consideration. Its optimal belief assignment is denoted $m_2(\cdot) = \{m_2(\theta_2) = 1, m_2(X) = 0 \text{ for } X \in G^\Theta \setminus \{\theta_2\}\}$. We perform a comparison of the four ESMS functions in order to show the evolution of the measure of similarity between $m_1(\cdot)$ and $m_2(\cdot)$ when $m_1(\cdot)$ varies from a uniformly distributed bba to $m_2(\cdot)$. More precisely, we start our simulation by choosing $m_1(\cdot)$ with all elements in $D^\Theta$ uniformly distributed, i.e. $m_1(a_i) = 1/18$, for $i = 1, 2, \ldots, 18$. Then, step by step, we increase the mass of belief of $\theta_2$ by a constant increment $\Delta = 0.01$ until reaching $m_1(\theta_2) = 1$. In the meantime, the mass $m_1(X)$ of belief of every element $X \neq \theta_2$ of $G^\Theta$ takes the value $[1 - m_1(\theta_2)]/17$ in order to work with a normalized bba $m_1(\cdot)$. The basic belief mass committed to the empty set is always zero, i.e. $m_1(\emptyset) = m_2(\emptyset) = 0$.
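The sweep described above can be sketched in a few lines of Python for the two measures that do not need the lattice structure of $D^\Theta$, namely $Sim_E$ and $Sim_B$. The position of $\theta_2$ in the list (`TARGET`), the clamping of the last step to exactly 1, and all function names are assumptions of this sketch; $Sim_J$ and $Sim_O$ are left out because they require the full set of Smarandache codes for $|\Theta| = 3$.

```python
import math

N_ELEMENTS = 18   # non-empty elements a_1..a_18 of D^Theta for |Theta| = 3
TARGET = 1        # list index standing for theta_2 (arbitrary position, an assumption)
STEP = 0.01       # constant increment Delta from the text


def make_m1(mass_on_target):
    """bba with the given mass on theta_2, the rest spread evenly over the other 17 elements."""
    rest = (1.0 - mass_on_target) / (N_ELEMENTS - 1)
    return [mass_on_target if i == TARGET else rest for i in range(N_ELEMENTS)]


def sim_euclidean(m1, m2):
    """Formula (8)."""
    return 1.0 - math.sqrt(sum((a - b) ** 2 for a, b in zip(m1, m2)) / 2.0)


def sim_bhattacharyya(m1, m2):
    """Formula (11), with both bbas given as aligned lists."""
    coeff = sum(math.sqrt(a * b) for a, b in zip(m1, m2))
    return 1.0 - math.sqrt(max(0.0, 1.0 - coeff))


def sweep():
    """Return (m1(theta_2), Sim_E, Sim_B) from the uniform bba up to m2."""
    m2 = make_m1(1.0)
    masses = []
    mass = 1.0 / N_ELEMENTS
    while mass < 1.0:
        masses.append(mass)
        mass += STEP
    # 1/18 plus multiples of 0.01 never lands exactly on 1, so the last point is clamped.
    masses.append(1.0)
    return [(p, sim_euclidean(make_m1(p), m2), sim_bhattacharyya(make_m1(p), m2))
            for p in masses]


for p, sim_e, sim_b in sweep()[::19]:
    print(f"m1(theta_2)={p:.3f}  Sim_E={sim_e:.4f}  Sim_B={sim_b:.4f}")
```

Both similarity values increase monotonically along the sweep and reach 1 when $m_1(\cdot) = m_2(\cdot)$.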


The degrees of similarity of the four ESMS functions are plotted in Figure 1. The speed of convergence⁸ of a similarity measure is characterized by the angle $\alpha$ of the slope of the curve at the origin, or by its tangent. Based on this speed of convergence criterion, the analysis of Figure 1 yields the following remarks:


1) According to Figure 1, $\tan(\alpha_B) \approx 0.86$, $\tan(\alpha_E) \approx 0.68$, $\tan(\alpha_O) \approx 0.6$ and $\tan(\alpha_J) \approx 0.57$. $Sim_J(m_1, m_2)$ has the slowest convergence speed, then $Sim_O(m_1, m_2)$ takes second place.

2) $Sim_E(m_1, m_2)$ has a faster speed of convergence than $Sim_O(m_1, m_2)$ and $Sim_J(m_1, m_2)$ because it doesn't consider the intrinsic complexity of the elements in $G^\Theta$.

3) The speed of convergence of $Sim_B(m_1, m_2)$ is the fastest. When $m_1(\cdot)$ and $m_2(\cdot)$ become very similar, $Sim_B(m_1, m_2)$ becomes very quickly close to 1. But if a small dissimilarity between $m_1(\cdot)$ and $m_2(\cdot)$ occurs, then $Sim_B(m_1, m_2)$ becomes very small, which actually makes it very sensitive to small dissimilarity perturbations.

4) In summary, one sees that no definitive conclusion about the best choice among these four ESMS functions can be drawn in general. However, if one considers the speed of convergence criterion of the dissimilarity/difference between two evidential sources as important, $Sim_B(m_1, m_2)$ is the best choice, because it is very sensitive to such a difference, whereas $Sim_J(m_1, m_2)$ is the worst choice with respect to such a criterion.


Figure 1: Comparison of performance among ESMS functions.


⁸ Here the convergence speed refers to how much the global agreement degree (similarity) is between $m_1(\cdot)$ and $m_2(\cdot)$ with the continuous decrease of $m_1(\theta_2)$.


6. Simulation results 


We present a simulation result to show how the ESMS filter performs in the generalized fusion machine (GFM), and its advantage. Let's take a 2D frame of discernment $\Theta = \{\theta_1, \theta_2\}$ and consider twenty equireliable sources of evidence according to Table 1. We consider the free DSm model, and the fusion space is the hyper-power set $D^\Theta = \{\theta_1, \theta_2, \theta_1 \cap \theta_2, \theta_1 \cup \theta_2\}$. $S_{c1}$ denotes the barycentre of the first 10 belief masses, while $S_{c2}$ denotes the barycentre⁹ of all belief masses. The measure of similarity based on the Euclidean ESMS function defined in equation (8) has been used here, but any other measure of similarity could be used instead. In this example, if we take 0.75 for the threshold value, we see from Table 1 that, for the first 10 sources of evidence, the measures of similarity of $S_5$ and $S_{10}$ with respect to $S_{c1}$ are lower than 0.75. Therefore, the sources $S_5$ and $S_{10}$ will be discarded/filtered out of the fusion process. If the threshold value is set to 0.8, then the sources $S_5$, $S_{10}$, $S_4$ and Sg will be discarded. That is, the higher the given threshold is, the fewer information sources pass through the filter.
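The thresholding just described can be sketched as follows in Python. The four bbas in `SOURCES` are hypothetical values invented for this illustration (they are not the entries of Table 1), and the function names as well as the choice to keep a source whose similarity equals the threshold are assumptions.

```python
import math


def barycentre(sources):
    """Element-wise mean of equireliable bbas given as aligned lists."""
    count = len(sources)
    return [sum(column) / count for column in zip(*sources)]


def sim_euclidean(m1, m2):
    """Euclidean ESMS function of equation (8)."""
    return 1.0 - math.sqrt(sum((a - b) ** 2 for a, b in zip(m1, m2)) / 2.0)


def esms_filter(sources, threshold):
    """Split 1-based source indices into (kept, rejected) by similarity to the barycentre."""
    centre = barycentre(sources)
    kept, rejected = [], []
    for index, bba in enumerate(sources, start=1):
        # Assumption: a similarity exactly equal to the threshold is accepted.
        if sim_euclidean(bba, centre) >= threshold:
            kept.append(index)
        else:
            rejected.append(index)
    return kept, rejected


# Hypothetical bbas over (theta1, theta2, theta1 & theta2, theta1 | theta2).
SOURCES = [
    [0.6, 0.2, 0.1, 0.1],
    [0.5, 0.3, 0.1, 0.1],
    [0.7, 0.1, 0.1, 0.1],
    [0.0, 0.9, 0.0, 0.1],   # outlier that strongly supports theta2
]

print(esms_filter(SOURCES, 0.7))
print(esms_filter(SOURCES, 0.8))
```

Raising the threshold from 0.7 to 0.8 removes one more source in this toy data set, which mirrors the behaviour reported for Table 1.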


Co [mO [m nan e 

o3 fo f oa f or fow 
Ps; [or [os [01 | 00 [0.5550 
Ps | oa [os | o1 | oo [osm] 
Sof 08 | o1 | o0 | o1 [0.6806 
Su | os o0 | o2 | 03 [0756] 
Sal o [06 [01 | 0.1 0.7205 | 


suf os for f oo f oo poea 


Ss | 05 | o2 | or | 02 [oso] 
Sef 05 | o3 | 00 | o2 | 0.8000 
se o o o | a o 
Ss or | o2 | 01 | oo [ors 
S| or | or | or | or [osar] 
Saf o3 fos P or M oo foro 


Table 1: A list of given sources of evidences. 





⁹ Let's denote $k = |G^\Theta|$ the cardinality of $G^\Theta$ and consider S independent sources of evidence. If all sources are equireliable, the barycentre of belief masses of the S sources is given by: $\forall j = 1, \ldots, k$, $m(X_j) = \frac{1}{S} \sum_{s=1}^{S} m_s(X_j)$; see [18] for details.









Figure 2: The procedure flow chart of DSmT-based generalized fusion machine.

In order to embed the ESMS function into the GFM [19], we give its working principle, summarized in Figure 2. The main steps of the algorithm for implementing the GFM are enumerated as follows:


1) Initialization of the parameters: the number of sources of evidence is set to zero (i.e. one has initially no source, s = 0), so that the number of sources in the filter window is n = 0.

2) Include¹⁰ a source of evidence $S_s$ and then test whether the number of sources s is less than 2. If $s \geq 2$, then go to the next step; otherwise include/take into account another source of evidence $S_2$.

3) Based on the barycentre of the gbba of the front $n \leq 10$ evidence sources, the degrees of similarity are computed according to formula (8) and compared with an a priori tuned threshold. If the similarity is larger than the threshold, then let n = n + 1. Otherwise, introduce a new source of evidence.

4) If n = 1, the current source, say S, is not involved in the fusion process. If n = 2, then the fusion step must apply between S and $S_s$ with the classical DSm rule [31], i.e. the conjunctive consensus. We then use the PCR5 rule [32] to redistribute the remaining partial conflicts only to the sets involved in the corresponding partial conflicts. We get a new combined source affected with the same index S. If $2 < n \leq 10$, after the current evidence source $S_s$ is combined with the final source of evidence produced last time, a new source of evidence is obtained and assigned to S again. Whenever $n \leq 10$, go back to step 2); otherwise, the current source of evidence $S_s$ under test has been accepted by the ESMS filter, $S_i$ is assigned to $S_{i-1}$, $i \in [2, \ldots, 10]$, and $S_s$ is assigned to $S_{10}$. Then, $S_{10}$ is combined with the last source S, the combined result is reassigned¹¹ to S, and then go back to step 2).

5) Test whether to stop or not¹²: if not, then introduce a new source of evidence; otherwise stop and exit.
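The fusion step in 4) starts with the conjunctive consensus. As a hedged illustration, the Python sketch below applies it on the hyper-power set of a 2D frame under the free DSm model, representing each element by its Smarandache code so that the intersection of two propositions is an ordinary set intersection. The PCR5 redistribution mentioned in the text is not implemented here; the element constants, the function name `dsm_conjunctive` and the example bba are assumptions of this sketch.

```python
from itertools import product

# Smarandache codes of the non-empty elements of D^Theta, Theta = {theta1, theta2},
# free DSm model: each element is the set of Venn-diagram parts it covers.
T1 = frozenset({"1", "12"})
T2 = frozenset({"2", "12"})
T1_AND_T2 = frozenset({"12"})
T1_OR_T2 = frozenset({"1", "2", "12"})


def dsm_conjunctive(m1, m2):
    """Classical DSm rule: the mass m1(A) * m2(B) is transferred to A & B."""
    fused = {}
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        key = a & b   # never empty here, since every code contains the part "12"
        fused[key] = fused.get(key, 0.0) + mass_a * mass_b
    return fused


# Hypothetical gbba, combined with itself once.
source = {T1: 0.5, T2: 0.2, T1_AND_T2: 0.1, T1_OR_T2: 0.2}
fused = dsm_conjunctive(source, source)
for element, mass in fused.items():
    print(sorted(element), round(mass, 4))
```

In this representation all products are accumulated on non-empty intersections, so the fused masses always sum to one.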


We show two simulation results in Figures 3 and 4 following the working principle of the GFM, when we use the sources of evidence listed in Table 1.


¹⁰ We assume the free DSm model and consider that the general basic belief assignments are given.

¹¹ In this work, we also use an ESMS filter window in a sliding mode.

¹² In our experiment, judge whether the mobile robot stops receiving the sonar's data.

The comparison of Figures 3 and 4 yields the following remarks:


1) In Figure 3, we don't see the real advantage of the ESMS filter, since the convergence to $\theta_1$ without the ESMS filter (green curve) is better than with the ESMS filter in terms of improving fusion precision. This is because some useful information sources are filtered and thrown away: as the threshold increases, fewer and fewer information sources enter into the final fusion. Generally speaking, this yields a slower speed of convergence to $\theta_1$. For example, for one source Sg in Table 1, if S is combined with itself once according to (6), the combinational result is $S = [m_N(\theta_1), m_N(\theta_2), m_N(\theta_1 \cap \theta_2), m_N(\theta_1 \cup \theta_2)]$¹³ $= [0.5707, 0.0993, 0.1680, 0.1620]$. If twice, the result is $S = [0.6765, 0.0852, 0.1429, 0.0954]$. If thrice, then the result is $S = [0.7453, 0.0693, 0.1216, 0.0638]$. The more combinations are performed, the closer $m_N(\theta_1)$ gets to 1 and the closer $m_N(\theta_2)$ gets to 0. Therefore, the ESMS filter might also lose some useful information while it filters out some bad information.

2) In Figure 4, one sees the role played by the ESMS filter. When there are highly conflicting sources, the result of the fusion process will not converge if the ESMS filter is not used. With the fine tuning of the ESMS threshold, the convergence becomes better and better because the ESMS filter processes the fused information and withdraws the sources which might cause the results to be incorrect or imprecise, so that it improves the fusion precision and correctness.

3) The reduction of the computing burden is obtained. Even if we have introduced an ESMS preprocessing step, it turns out that finally a drastic reduction of the computing burden is obtained, because we can significantly reduce the number of sources to combine with the ESMS criterion.

4) We increase the applicability of the classical rules of combination. Since we reduce the number of conflicting sources of evidence thanks to the ESMS preprocessing, the degree of conflict between the sources to combine is kept low. Therefore, classical rules, like Dempster's rule, which do not perform well in highly conflicting situations, can also be used in applications like the one studied here. Without the ESMS preprocessing step, the classical fusion rules cannot work very well [23, 29]. Therefore, we extend their domain of applicability when using the ESMS filtering step.


¹³ $m_N(\cdot)$ refers to the new generalized basic belief assignment.

Figure 3: Fusion result of the front 10 sources using different thresholds (0 ~ 0.9).

Figure 4: Fusion result of the total 20 sources using different thresholds (0 ~ 0.9).

7. An application in mobile robot perception


The information acquired when building a grid map using sonar sensors on a mobile robot is usually uncertain, imprecise and even highly conflicting. Such an application in autonomous robot perception and navigation provides a good platform to verify experimentally the capability of the ESMS filter in the GFM. Although there exist many methods of map building based on Probability theory [36], FST (Fuzzy System Theory) [24], DST [34], GST (Grey System Theory) [39, 40, 41], or DSmT [20], we only compare the performance of map building using a classical fusion machine without the ESMS filter (called CFMW) with that of the classical fusion machine with the ESMS filter (called GFM), in the DSmT framework only. A detailed comparison between our current ESMS-based approach and other methods is given in a companion paper [21], where we show that the ESMS-based approach outperforms other approaches using almost the same experimental conditions and inputs. In order to further reduce the measurement noise, we improve our past belief assignment model of sonar sensors in the DSmT framework¹⁴ as follows:


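$$m(\theta_1) = \begin{cases} \left(1 - \rho/(R - 2\epsilon)\right) \times (1 - \lambda/2) & \text{if } [\ldots],\ 0 \leq \varphi \leq \omega/2 \\ 0 & \text{otherwise} \end{cases} \qquad (12)$$

$$m(\theta_2) = \begin{cases} \exp\left(-3(\rho - R)^2\right) \times \lambda & \text{if } [\ldots],\ 0 \leq \varphi \leq \omega/2 \\ 0 & \text{otherwise} \end{cases} \qquad (13)$$

$$m(\theta_1 \cap \theta_2) = \begin{cases} 1 - \left(2(\rho - (R - \epsilon))/R\right)^2 & \text{if } [\ldots],\ 0 \leq \varphi \leq \omega/2 \\ 0 & \text{otherwise} \end{cases} \qquad (14)$$

$$m(\theta_1 \cup \theta_2) = \begin{cases} \tan(2(\rho - R)) \times (1 - \lambda) & \text{if } R \leq \rho \leq R + \epsilon \\ 0 & \text{otherwise} \end{cases} \qquad (15)$$

where $\lambda$ is given by (see [9] for justification)

$$\lambda = \begin{cases} 1 - (2\varphi/\omega)^2 & \text{if } 0 \leq |\varphi| \leq \omega/2 \\ 0 & \text{otherwise} \end{cases} \qquad (16)$$


¹⁴ We assume that there are only two focal elements $\theta_1$ and $\theta_2$ in the frame of discernment. The elements of the hyper-power set are $\theta_1$, $\theta_2$, $\theta_1 \cap \theta_2$ and $\theta_1 \cup \theta_2$. $\theta_1$ represents the emptiness of a given grid cell, $\theta_2$ represents the occupancy of a given grid cell, $\theta_1 \cap \theta_2$ means that there is some conflict between two sonar measurements for the same grid cell, and $\theta_1 \cup \theta_2$ represents the ignorance for a grid cell because of the possible lack of measurement acquisition.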


The parameters $R$, $\rho$, $\epsilon$, $R_{min}$, $\omega$ and $\varphi$ in formulas (12)-(16) were defined and used in [19, 20, 21]. $R$ is the range measurement. $\rho$ is the distance between the grid cell and the sonar's emitting point. $\epsilon$ is the range measurement error. $R_{min}$ is the minimal range distance of sonar sensors. $\omega$ is the scattering angle of the sonar. $\varphi$ is the angle between the line (from the grid cell to the sonar's emitting point) and the sonar's emitting direction. The following functions $C_1$ and $C_2$ play an important role in reducing noise in the process of map building. The $C_1$ function, proposed by Wang in [39, 40, 41], is a constraint function¹⁵ for sonar measurements defined by:


$$C_1 = \begin{cases} 0 & \text{if } \rho > \rho_{l2} \\ [\ldots] & \text{if } \rho_{l1} \leq \rho \leq \rho_{l2} \\ 1 & \text{if } \rho < \rho_{l1} \end{cases} \qquad (17)$$

where $\rho_{l1}$ and $\rho_{l2}$ represent the upper and lower limits of valid measurements. $C_2$ is the constraint function¹⁶ for the sonar's uncertainty, defined as follows:

$$C_2 = \begin{cases} [\ldots] & [\ldots] \\ [\ldots] & [\ldots] \\ 0 & \text{if } |\rho - R| > 0.5\epsilon \end{cases} \qquad (18)$$


¹⁵ The main idea is that the sonar readings must be discounted according to sonar characteristics.

¹⁶ The main idea consists in committing high belief assignments to sonar readings close to the sonar sensor.

The product of $C_1$ and $C_2$ is multiplied by the belief assignment functions, i.e. $m(\theta_2)$, $m(\theta_1 \cap \theta_2)$ and $m(\theta_1 \cup \theta_2)$ respectively.


The experiment is performed by running a Pioneer II mobile robot with 16 sonar detectors in the indoor laboratory environment shown in Figure 5. The environment's size is 4550 mm x 3750 mm. The environment is divided into 91 x 75 rectangular cells of the same size (i.e. 50 mm x 50 mm each) according to the grid map method. The robot starts to move from the location (1 m, 0.6 m), facing towards 0 degree. We take the bottom-left corner as the global coordinate origin of the map. Objects/obstacles in the rectangular grid map are sketched in Figure 6. The processing steps of our intelligent perception and fusion system have been implemented with our software toolbox developed under C++ 6.0 and OpenGL, acting as the client end. When the robot moves in the environment, the server end collects much information (i.e. the location of the robot, sensor measurements, velocity, etc.) from the mobile robot and its onboard sensors. Through the TCP/IP protocol, the client end can get any information from the server end and fuse it.


Since our environment is small, the robot moves only for a short time or over a short distance. Therefore, one only considers the self-localization method based on the $\delta$-NFAM¹⁷ method [17, 19], with the search from $\theta - \delta\theta$ to $\theta + \delta\theta$. In order to reduce the computation burden, the restricted spreading arithmetic has been used. The flow chart of the procedure for this experiment is given in Figure 7. Its main steps are the following ones:


1) Initialize the parameters of the robot (location, velocity, etc.).

2) Acquire 16 sonar measurements and the robot's location from the odometer while the robot is running in the environment. We can calibrate the robot's pose with our $\delta$-NFAM method [17, 19]. Here we set the first timer, whose interval is 100 ms.

3) Compute the gbba of the fan-form area detected by each sonar sensor according to the formulas in [20].

4) Apply the DSmT-based GFM, that is, adopt the Euclidean information filter to choose basic consistent sources of evidence according to formula (8). Then combine the consistent sources with the DSm conjunctive rule [4, 5, 31] and compute the gbbas after combination. Then, redistribute the partial conflicting masses to the gbbas of the sets involved in the partial conflict only, with the PCR5 rule [32].

5) Compute the belief of occupancy $Bel(\theta_2)$ of some grid cells according to [31]. Save them into the map matrix and then go to step 6).

6) Update the map of the environment (here we set the second timer, whose interval is 100 ms). Generally speaking, the more times the map is scanned, the more accurate the final reconstructed map is. At the same time, also test whether the robot stops receiving the detecting data: if yes, then stop the fusion and exit; otherwise, go back to step 2).


¹⁷ One has taken $\delta\theta = 5°$ in our experiment.





Figure 5: The real experimental environment.

Figure 6: Global coordinate system for the experiment.


In this experiment, we obtain the maps built by the GFM before and after improving the sonar model, as shown in Figures 8 and 9 respectively. In order to show the advantage of the ESMS filter in the GFM, we also compare our approach with the classical fusion machine which doesn't use the ESMS filter (called CFMW). The maps built by the CFMW before and after improving the sonar model are shown in Figures 10 and 11 respectively. Whether the map is built before or after improving the sonar model, one sees that the GFM always outperforms the CFMW because one obtains clearer boundary outlines and less noise in the map reconstruction. In addition, the ESMS information filter coupled with the PCR5 fusion rule drastically reduces the computational burden because the ESMS filter can filter out the outlier sources. With the GFM approach, only the most consistent sources of evidence are combined, which reduces the uncertainty in the fusion result and improves the robot's perception of the surrounding environment.

Figure 7: Flow chart of the map building with the GFM.

Figure 9: Map building based on the GFM after improving the sonar model.

Figure 11: Map building based on the CFMW after improving the sonar model.

8. Conclusions


In this paper, a general Evidence Supporting Measure of Similarity (ESMS) between two basic belief assignments has been proposed. ESMS can be used on different fusion spaces (lattice structures) and with different distance measures. This approach makes it possible to select the most coherent subset of the available sources of evidence and to reject outlier sources which are too conflicting with the other sources. Hence, a drastic reduction of the computational burden is possible while keeping good performance, which is very attractive for real-time applications having limited computing resources. The hybrid of ESMS with the sophisticated and efficient PCR5 fusion rule of DSmT, called GFM (Generalized Fusion Machine), is especially useful and interesting in robotic applications involving real-time perception and navigation systems. The real application of GFM for mobile robot perception from sonar sensors presented in this work clearly shows a substantial improvement of the fusion result in map building/estimation of the surrounding environment. This work also shows the crucial role played by the most advanced fusion techniques for applications in robotics.

Contradiction Measures and Specificity Degrees of Basic Belief Assignments


Abstract—In the theory of belief functions, many measures 
of uncertainty have been introduced. However, it is not always 
easy to understand what these measures really try to represent. 
In this paper, we re-interpret some measures of uncertainty in 
the theory of belief functions. We present some interests and 
drawbacks of the existing measures. On these observations, we 
introduce a measure of contradiction. Therefore, we present some 
degrees of non-specificity and Bayesianity of a mass. We propose 
a degree of specificity based on the distance between a mass and 
its most specific associated mass. We also show how to use the 
degree of specificity to measure the specificity of a fusion rule. 
Illustrations on simple examples are given. 

Keywords: Belief function, uncertainty measures, specificity, conflict.


I. INTRODUCTION 


The theory of belief functions was first introduced by [1] 
in order to represent some imprecise probabilities with upper 
and lower probabilities. Then [15] proposed a mathematical 
theory of evidence. 


Let $\Theta$ be a frame of discernment. A basic belief assignment (bba) m is the mapping from elements of the powerset $2^\Theta$ onto [0, 1] such that:

$$\sum_{X \in 2^\Theta} m(X) = 1. \qquad (1)$$


The axiom $m(\emptyset) = 0$ is often used, but not mandatory. A focal element X is an element of $2^\Theta$ such that $m(X) \neq 0$. The difference between a bba and a probability is the domain of definition. A bba is defined on the powerset $2^\Theta$ and not only on $\Theta$. In the powerset, the elements are not all equivalent in terms of precision. Indeed, $\theta_1 \in \Theta$ is more precise than $\theta_1 \cup \theta_2 \in 2^\Theta$.

In the case of the DSmT introduced in [17], the bbas are defined on an extension of the powerset: the hyper powerset, noted $D^\Theta$, formed by the closure of $\Theta$ under union and intersection. The problem of signification of each focal element is the same as in $2^\Theta$. For instance, $\theta_1 \in \Theta$ is less precise than $\theta_1 \cap \theta_2 \in D^\Theta$. In the rest of the paper, we will note $G^\Theta$ for either $2^\Theta$ or $D^\Theta$.

In order to try to quantify the measure of uncertainty, such as in the set theory [5] or in the theory of probabilities [16], some measures have been proposed and discussed in the theory of belief functions [2], [7], [8], [21]. However, the domain of definition of the bba does not allow an ideal definition of a measure of uncertainty. Moreover, behind the term of uncertainty, different notions are hidden.

In section II, we present different kinds of measures of uncertainty given in the state of the art, we discuss them and give our definitions of some terms concerning the uncertainty. In section III, we introduce a measure of contradiction and discuss it. We introduce simple degrees of uncertainty in section IV, and propose a degree of specificity in section V. We show how this degree of specificity can be used to measure the specificity of a combination rule.


II. MEASURES OF UNCERTAINTY ON BELIEF FUNCTIONS 


In the framework of the belief functions, several functions 
(we call them belief functions) are in one to one correspon- 
dence with the bba: bel, pl and q. From these belief functions, 
we can define several measures of uncertainty. Klir in [8] 
distinguishes two kinds of uncertainty: the non-specificity 
and the discord. Hence, we recall hereafter the main belief 
functions, and some non-specificity and discord measures. 


A. Belief functions 


Hence, the credibility and plausibility functions represent respectively a minimal and a maximal belief. The credibility function is given from a bba for all $X \in G^\Theta$ by:

$$bel(X) = \sum_{Y \subseteq X,\ Y \not\equiv \emptyset} m(Y). \qquad (2)$$

The plausibility is given from a bba for all $X \in G^\Theta$ by:

$$pl(X) = \sum_{Y \in G^\Theta,\ Y \cap X \not\equiv \emptyset} m(Y). \qquad (3)$$

The commonality function is another belief function, given by:

$$q(X) = \sum_{Y \in G^\Theta,\ Y \supseteq X} m(Y). \qquad (4)$$
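Equations (2)-(4) translate almost literally into code when $G^\Theta = 2^\Theta$ and the subsets are stored as `frozenset` objects. The Python sketch below is only an illustration under that assumption; the example bba over $\Theta = \{a, b, c\}$ is hypothetical.

```python
def bel(m, x):
    """Credibility, equation (2): masses of the non-empty subsets of x."""
    return sum(mass for y, mass in m.items() if y and y <= x)


def pl(m, x):
    """Plausibility, equation (3): masses of the sets that intersect x."""
    return sum(mass for y, mass in m.items() if y & x)


def q(m, x):
    """Commonality, equation (4): masses of the supersets of x."""
    return sum(mass for y, mass in m.items() if y >= x)


# Hypothetical bba on the powerset of Theta = {a, b, c}.
A, AB, ABC = frozenset("a"), frozenset("ab"), frozenset("abc")
m = {A: 0.5, AB: 0.3, ABC: 0.2}

for x in (A, frozenset("b"), AB, ABC):
    print(sorted(x), bel(m, x), pl(m, x), q(m, x))
```

On every subset the printed values satisfy $bel(X) \leq pl(X)$, in line with the reading of the two functions as a minimal and a maximal belief.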


These functions allow an implicit model of imprecise and uncertain data. However, these functions are monotonic by inclusion: bel and pl are increasing, and q is decreasing. This is the reason why, most of the time, we use a probability to take a decision. The most used projection into the probability subspace is the pignistic probability transformation introduced by [18] and given by:

$$betP(X) = \sum_{Y \in G^\Theta,\ Y \not\equiv \emptyset} \frac{|X \cap Y|}{|Y|}\, m(Y), \qquad (5)$$

where $|X|$ is the cardinality of X; in the case of the DSmT, that is the number of disjoint elements corresponding in the Venn diagram.

From this probability, we can use the measures of uncertainty given in the theory of probabilities, such as the Shannon entropy [16], but we lose the interest of the belief functions and the information given on the subsets of the discernment space $\Theta$.


B. Non-specificity 

The non-specificity in the classical set theory is the impre- 
cision of the sets. Such as in [14], we define in the theory of 
belief functions, the non-specificity related to vagueness and 
non-specificity. 

Definition The non-specificity in the theory of belief 
functions quantifies how a bba m is imprecise. 

The non-specificity of a subset X is defined by Hartley [5] by $\log_2(|X|)$. This measure was generalized by [2] in the theory of belief functions by:

$$NS(m) = \sum_{X \in G^\Theta,\ X \not\equiv \emptyset} m(X) \log_2(|X|). \qquad (6)$$

That is a weighted sum of the non-specificity, and the weights are given by the basic belief in X. Ramer in [13] has shown that it is the unique possible measure of non-specificity in the theory of belief functions under some assumptions such as symmetry, additivity, sub-additivity, continuity, branching and normalization.

If the measure of the non-specificity on a bba is low, we can consider that the bba is specific. Yager in [21] defined a specificity measure such as:

$$S(m) = \sum_{X \in G^\Theta,\ X \not\equiv \emptyset} \frac{m(X)}{|X|}. \qquad (7)$$


Both definitions correspond to an accumulation of a function of the basic belief assignment on the focal elements. Unlike the classical set theory, we must take into account the bba in order to quantify (to weight) the belief of the imprecise focal elements. The imprecision of a focal element can of course be given by the cardinality of the element.

First of all, we must be able to compare the non-specificity (or specificity) between several bba's, even if these bba's are not defined on the same discernment space. That is not the case with equations (6) and (7). The non-specificity of equation (6) takes its values in $[0, \log_2(|\Theta|)]$. The specificity of equation (7) can have values in $[1/|\Theta|, 1]$. We will show how we can easily define a degree of non-specificity in [0, 1]. We could also define a degree of specificity from equation (7), but that is more complicated and we will later show how we can define a specificity degree.

The most non-specific bba for both equations (6) and (7) is the total ignorance bba given by the categorical bba $m_\Theta$: $m(\Theta) = 1$. We have $NS(m) = \log_2(|\Theta|)$ and $S(m) = 1/|\Theta|$. This categorical bba is clearly the most non-specific for us. However, the most specific bba's are the Bayesian bba's. The only focal elements of a Bayesian bba are the simple elements of $\Theta$. On these kinds of bba m, we have $NS(m) = 0$ and $S(m) = 1$. For example, we take the three Bayesian bba's defined on $\Theta = \{\theta_1, \theta_2, \theta_3\}$ by:

$$m_1(\theta_1) = m_1(\theta_2) = m_1(\theta_3) = 1/3, \qquad (8)$$
$$m_2(\theta_1) = m_2(\theta_2) = 1/2,\ m_2(\theta_3) = 0, \qquad (9)$$
$$m_3(\theta_1) = 1,\ m_3(\theta_2) = m_3(\theta_3) = 0. \qquad (10)$$

We obtain the same non-specificity and specificity for these three bba's.

That contradicts our intuition: we expect the bba $m_3$ to be
the most specific and $m_1$ the least specific.
We will define a degree of specificity relative to a most
specific bba that we will introduce.
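The observation above can be checked numerically. The following is a minimal sketch, not the authors' code: it represents a bba as a Python dict from `frozenset` focal elements to masses, and it assumes that the non-specificity of equation (6), which is defined earlier in the paper, has the usual form $NS(m) = \sum_X m(X) \log_2(|X|)$ (an assumption consistent with the range $[0, \log_2(|\Theta|)]$ quoted in this section). The function and variable names are illustrative.

```python
from math import log2

def non_specificity(m):
    """Assumed form of equation (6): sum of m(X) * log2(|X|) over non-empty X."""
    return sum(mass * log2(len(X)) for X, mass in m.items() if X)

def yager_specificity(m):
    """Equation (7): sum of m(X) / |X| over non-empty focal elements X."""
    return sum(mass / len(X) for X, mass in m.items() if X)

def s(*names):
    """Build a focal element from hypothesis labels (illustrative helper)."""
    return frozenset(names)

# The three Bayesian bba's of equations (8)-(10), plus total ignorance.
m1 = {s("t1"): 1 / 3, s("t2"): 1 / 3, s("t3"): 1 / 3}
m2 = {s("t1"): 0.5, s("t2"): 0.5}
m3 = {s("t1"): 1.0}
m_ignorance = {s("t1", "t2", "t3"): 1.0}

for name, m in [("m1", m1), ("m2", m2), ("m3", m3), ("ignorance", m_ignorance)]:
    print(name, non_specificity(m), yager_specificity(m))
```

Under these assumptions the three Bayesian bba's all yield $NS = 0$ and $S = 1$, so neither measure separates the uniform bba from the categorical one.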


C. Discord 


Different kinds of discord have been defined as extensions
to belief functions of the Shannon entropy, which is defined for
probabilities. Some discord measures have been proposed from
the plausibility, the credibility or the pignistic probability:

$$E(m) = -\sum_{X \in G^\Theta} m(X) \log_2(\mathrm{pl}(X)), \tag{11}$$
$$C(m) = -\sum_{X \in G^\Theta} m(X) \log_2(\mathrm{bel}(X)), \tag{12}$$
$$D(m) = -\sum_{X \in G^\Theta} m(X) \log_2(\mathrm{betP}(X)), \tag{13}$$


with $E(m) \leq D(m) \leq C(m)$. We can also give the Shannon
entropy of the pignistic probability:

$$-\sum_{X \in G^\Theta} \mathrm{betP}(X) \log_2(\mathrm{betP}(X)). \tag{14}$$
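As an illustration of equations (11) to (13), the sketch below evaluates the three discord measures on a small bba defined on a power set. It is a hedged example, not taken from the paper: the bba is invented for the illustration, and `betP` is assumed to be the usual pignistic extension to sets, $\mathrm{betP}(X) = \sum_A m(A)\,|A \cap X| / |A|$.

```python
from math import log2

def bel(m, X):
    """Credibility: total mass of the non-empty focal elements included in X."""
    return sum(v for A, v in m.items() if A and A <= X)

def pl(m, X):
    """Plausibility: total mass of the focal elements intersecting X."""
    return sum(v for A, v in m.items() if A & X)

def betP(m, X):
    """Assumed pignistic probability of a set X: sum of m(A) * |A & X| / |A|."""
    return sum(v * len(A & X) / len(A) for A, v in m.items() if A)

def discord(m, f):
    """-sum of m(X) * log2(f(X)) over focal elements, as in (11)-(13)."""
    return -sum(v * log2(f(m, X)) for X, v in m.items() if v > 0)

# Hypothetical bba on the frame {a, b, c}.
m = {frozenset(["a"]): 0.5, frozenset(["a", "b"]): 0.3, frozenset(["c"]): 0.2}

E = discord(m, pl)    # equation (11)
C = discord(m, bel)   # equation (12)
D = discord(m, betP)  # equation (13)
print(E, D, C)
```

Because $\mathrm{bel}(X) \leq \mathrm{betP}(X) \leq \mathrm{pl}(X)$ for every focal element, the three values come out in the order $E \leq D \leq C$.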


Other measures have been proposed; [8] has shown that these
measures can be summarized by:

$$-\sum_{X \in G^\Theta} m(X) \log_2(1 - \mathrm{Con}_m(X)), \tag{15}$$

where $\mathrm{Con}_m(X)$ is called a conflict measure of the bba
$m$ on $X$. However, from our point of view, that is not a
conflict as presented in [20], but a contradiction. We give the
following two definitions:

Definition A contradiction in the theory of belief functions 
quantifies how a bba m contradicts itself. 

Definition (C1) The conflict in the theory of belief functions 
can be defined by the contradiction between 2 or more bba’s. 

In order to measure the conflict in the theory of belief
functions, it has been usual to use the mass $k$ that the
conjunctive combination rule assigns to the empty set. This rule
is given for two basic belief assignments $m_1$ and $m_2$ and for
all $X \in G^\Theta$ by:

$$m_c(X) = \sum_{A \cap B = X} m_1(A) m_2(B) := (m_1 \oplus m_2)(X). \tag{16}$$

$k = m_c(\emptyset)$ can also be interpreted as a non-expected solution.

In [21], Yager proposed another conflict measure from the
value of $k$, given by $-\log_2(1 - k)$.

However, as observed in [9], the weight of conflict given
by $k$ (and all the functions of $k$) is not a conflict measure
between the basic belief assignments. Indeed, this value
depends entirely on the conjunctive rule, and this rule
is non-idempotent: the combination of identical basic belief
assignments generally leads to a positive value of $k$. To
highlight this behavior, we defined in [12] the auto-conflict,
which quantifies the intrinsic conflict of a bba. The
auto-conflict of order $n$ for one expert is given by:

$$a_n = \left( \bigoplus_{i=1}^{n} m \right)(\emptyset). \tag{17}$$

The auto-conflict is a kind of measure of the contradiction,
but it depends on the order. We studied its behavior in [11].
Therefore we need to define a measure of contradiction that is
independent of the order. This measure is presented in the
next section.
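The non-idempotence of the conjunctive rule is easy to observe on a small case. The sketch below is an illustrative implementation of equations (16) and (17) on a power set, with bba's stored as dicts keyed by `frozenset`; the names `conjunctive` and `auto_conflict` are chosen for this example only.

```python
from functools import reduce

def conjunctive(m1, m2):
    """Equation (16): m_c(X) = sum of m1(A) * m2(B) over pairs with A & B == X."""
    out = {}
    for A, a in m1.items():
        for B, b in m2.items():
            X = A & B
            out[X] = out.get(X, 0.0) + a * b
    return out

def auto_conflict(m, n):
    """Equation (17): mass on the empty set after combining m with itself n times."""
    combined = reduce(conjunctive, [m] * n)
    return combined.get(frozenset(), 0.0)

# A bba split evenly between two exclusive singletons.
m = {frozenset(["t1"]): 0.5, frozenset(["t2"]): 0.5}
for n in (1, 2, 3):
    print(n, auto_conflict(m, n))
```

For this bba the auto-conflict grows with the order (0, then 0.5, then 0.75), which illustrates why a measure independent of $n$ is needed.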


III. A CONTRADICTION MEASURE 


Definition (C1) of the conflict implies, first, that it is
measured on the space of bba's and, second, that if the opinions
of two experts are far from each other, we consider them to be in
conflict. That suggests a notion of distance. That is the reason
why, in [11], we defined the measure of conflict
between experts' assertions through a distance between their
respective bba's. The conflict measure between 2 experts is
defined by:

$$\mathrm{Conf}(1, 2) = d(m_1, m_2). \tag{18}$$


We defined the conflict measure between one expert $i$ and the
other $M - 1$ experts by:

$$\mathrm{Conf}(i, \mathcal{E}) = \frac{1}{M - 1} \sum_{j=1,\, j \neq i}^{M} \mathrm{Conf}(i, j), \tag{19}$$

where $\mathcal{E} = \{1, \ldots, M\}$ is the set of experts in conflict with $i$.
Another definition is given by:

$$\mathrm{Conf}(i, M) = d(m_i, m_M), \tag{20}$$

where $m_M$ is the bba of the artificial expert representing the
combined opinions of all the experts in $\mathcal{E}$ except $i$.

We use the distance defined in [6], which is for us the most
appropriate, but other distances are possible. See [4] for a
comparison of distances in the theory of belief functions. This
distance is defined for two basic belief assignments $m_1$ and
$m_2$ on $G^\Theta$ by:

$$d(m_1, m_2) = \sqrt{\frac{1}{2} (m_1 - m_2)^T D\, (m_1 - m_2)}, \tag{21}$$

where $D$ is a $|G^\Theta| \times |G^\Theta|$ matrix based on the Jaccard distance,
whose elements are:

$$D(A, B) = \begin{cases} 1, & \text{if } A = B = \emptyset, \\ \dfrac{|A \cap B|}{|A \cup B|}, & \forall A, B \in G^\Theta. \end{cases} \tag{22}$$
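Equations (21) and (22) translate directly into code when bba's are restricted to a power set. The following is a minimal sketch under that assumption; it evaluates the quadratic form only over the focal elements of the two bba's, since all other components of $m_1 - m_2$ are zero. The names `jaccard` and `jousselme` are illustrative.

```python
from math import sqrt

def jaccard(A, B):
    """Equation (22): 1 if both sets are empty, else |A & B| / |A | B|."""
    if not A and not B:
        return 1.0
    return len(A & B) / len(A | B)

def jousselme(m1, m2):
    """Equation (21): sqrt(0.5 * (m1 - m2)^T D (m1 - m2))."""
    focal = set(m1) | set(m2)
    diff = {X: m1.get(X, 0.0) - m2.get(X, 0.0) for X in focal}
    quad = sum(diff[A] * jaccard(A, B) * diff[B] for A in focal for B in focal)
    # Guard against tiny negative values caused by floating-point rounding.
    return sqrt(max(quad, 0.0) / 2.0)

a = frozenset(["t1"])
b = frozenset(["t2"])
ab = a | b
print(jousselme({a: 1.0}, {b: 1.0}))   # two exclusive categorical bba's
print(jousselme({a: 1.0}, {ab: 1.0}))  # a singleton against its superset
```

Two categorical bba's on disjoint sets are at the maximal distance 1, while a categorical bba on a singleton is closer to one on a superset, which is the behavior the weighting matrix $D$ is meant to capture.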


However, this measure is a total conflict measure. In order
to define a contradiction measure, we keep the same spirit.
First, the contradiction of an element $X$ with respect to a bba
$m$ is defined as the distance between the bba's $m$ and $m_X$,
where $m_X(X) = 1$, $X \in G^\Theta$, is the categorical bba:

$$\mathrm{Contr}_m(X) = d(m, m_X), \tag{23}$$

where the distance can also be the Jousselme distance on the
bba's. The contradiction of a bba is then defined as a weighted
contradiction of all the elements $X$ of the considered space
$G^\Theta$:

$$\mathrm{Contr}_m = 2 \sum_{X \in G^\Theta} m(X)\, d(m, m_X). \tag{24}$$

The factor 2 is used to obtain values in $[0, 1]$, as shown in
the following illustration.


A. Illustration 


First we note that for every categorical bba $m_Y$, the
contradiction given by equation (23) is $\mathrm{Contr}_{m_Y}(Y) = 0$,
and the contradiction given by equation (24) is also
$\mathrm{Contr}_{m_Y} = 0$. Considering the bba $m_1(\theta_1) = 0.5$ and
$m_1(\theta_2) = 0.5$, we have $\mathrm{Contr}_{m_1} = 1$. That is the maximum
of the contradiction; hence the contradiction of a bba takes its
values in $[0, 1]$.


Figure 1. Bayesian bba's


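Take the Bayesian bba given by $m_2(\theta_1) = 0.6$,
$m_2(\theta_2) = 0.3$, and $m_2(\theta_3) = 0.1$. We obtain:

$$\mathrm{Contr}_{m_2}(\theta_1) \simeq 0.36, \quad \mathrm{Contr}_{m_2}(\theta_2) \simeq 0.66, \quad \mathrm{Contr}_{m_2}(\theta_3) \simeq 0.79.$$

The contradiction for $m_2$ is $\mathrm{Contr}_{m_2} = 0.9849$.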

Take $m_3(\{\theta_1, \theta_2, \theta_3\}) = 0.6$, $m_3(\theta_2) = 0.3$, and
$m_3(\theta_3) = 0.1$; the masses are the same as for $m_2$, but the
highest one is on $\Theta = \{\theta_1, \theta_2, \theta_3\}$ instead of
$\theta_1$. We obtain:

$$\mathrm{Contr}_{m_3}(\{\theta_1, \theta_2, \theta_3\}) \simeq 0.28, \quad \mathrm{Contr}_{m_3}(\theta_2) \simeq 0.56, \quad \mathrm{Contr}_{m_3}(\theta_3) \simeq 0.71.$$

Figure 2. Non-dogmatic bba

The contradiction for $m_3$ is $\mathrm{Contr}_{m_3} = 0.8092$. We can see
that the contradiction is lower, because the distance takes
into account the imprecision of $\Theta$.








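Figure 3. Focal elements of cardinality 2

If we now consider the same mass values but on
focal elements of cardinality 2, $m_4(\{\theta_1, \theta_2\}) = 0.6$,
$m_4(\{\theta_1, \theta_3\}) = 0.3$, and $m_4(\{\theta_2, \theta_3\}) = 0.1$, we obtain:

$$\mathrm{Contr}_{m_4}(\{\theta_1, \theta_2\}) \simeq 0.29, \quad \mathrm{Contr}_{m_4}(\{\theta_1, \theta_3\}) \simeq 0.53, \quad \mathrm{Contr}_{m_4}(\{\theta_2, \theta_3\}) \simeq 0.65.$$

The contradiction for $m_4$ is $\mathrm{Contr}_{m_4} = 0.80$.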


The fewer focal elements there are, the smaller the contradiction
of the bba will be; and the more precise the focal elements are,
the higher the contradiction of the bba will be.
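The values quoted in this illustration can be recomputed with a few lines of code. The sketch below is an illustrative implementation of equations (21) to (24) on a power set, with bba's stored as dicts keyed by `frozenset`; the helper names are chosen for this example and are not from the paper.

```python
from math import sqrt

def jaccard(A, B):
    """Equation (22)."""
    if not A and not B:
        return 1.0
    return len(A & B) / len(A | B)

def jousselme(m1, m2):
    """Equation (21), evaluated over the focal elements of both bba's."""
    focal = set(m1) | set(m2)
    diff = {X: m1.get(X, 0.0) - m2.get(X, 0.0) for X in focal}
    quad = sum(diff[A] * jaccard(A, B) * diff[B] for A in focal for B in focal)
    return sqrt(max(quad, 0.0) / 2.0)

def contradiction(m):
    """Equation (24): 2 * sum of m(X) * d(m, m_X), m_X categorical on X."""
    return 2.0 * sum(v * jousselme(m, {X: 1.0}) for X, v in m.items() if v > 0)

def s(*names):
    return frozenset(names)

m_half = {s("t1"): 0.5, s("t2"): 0.5}
m2 = {s("t1"): 0.6, s("t2"): 0.3, s("t3"): 0.1}
m3 = {s("t1", "t2", "t3"): 0.6, s("t2"): 0.3, s("t3"): 0.1}
m4 = {s("t1", "t2"): 0.6, s("t1", "t3"): 0.3, s("t2", "t3"): 0.1}

for name, m in [("m_half", m_half), ("m2", m2), ("m3", m3), ("m4", m4)]:
    print(name, round(contradiction(m), 4))
```

With these definitions the computed contradictions agree with the figures given in the text to the stated precision.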


IV. DEGREES OF UNCERTAINTY 


We have seen in section II that the measure of non-specificity
given by equation (6) takes its values in a space that
depends on the size of the discernment space $\Theta$. Indeed, the
measure of non-specificity takes its values in $[0, \log_2(|\Theta|)]$.

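In order to compare non-specificity measures in an
absolute space, we define a degree of non-specificity from
equation (6) by:

$$\delta_{NS}(m) = \sum_{X \in G^\Theta,\, X \neq \emptyset} m(X) \frac{\log_2(|X|)}{\log_2(|\Theta|)} = \sum_{X \in G^\Theta,\, X \neq \emptyset} m(X) \log_{|\Theta|}(|X|). \tag{25}$$

Therefore, this degree takes its values in $[0, 1]$ for all bba's
$m$, independently of the size of the discernment space. We still
have $\delta_{NS}(m_\Theta) = 1$, where $m_\Theta$ is the categorical bba giving the
total ignorance. Moreover, we obtain $\delta_{NS}(m) = 0$ for all
Bayesian bba's.

Table I
EVALUATION OF BAYESIANITY ON EXAMPLES

                          | $m_1$ | $m_2$ | $m_3$ | $m_4$ | $m_5$ | $m_6$ | $m_7$
--------------------------+-------+-------+-------+-------+-------+-------+------
$\theta_1$                | 0.4   | 0.3   | 0.1   | 0.3   | 0     | 0     | 0
$\theta_2$                | 0.1   | 0.1   | 0.3   | 0.1   | 0     | 0     | 0
$\theta_3$                | 0.1   | 0.1   | 0.1   | 0.1   | 0     | 0     | 0
$\theta_1 \cup \theta_2$  | 0.3   | 0.3   | 0.5   | 0     | 0.6   | 0.6   | 0
$\theta_1 \cup \theta_3$  | 0.1   | 0.2   | 0     | 0     | 0.4   | 0     | 0
$\theta_2 \cup \theta_3$  | 0     | 0     | 0     | 0     | 0     | 0     | 0
$\Theta$                  | 0     | 0     | 0     | 0.5   | 0     | 0.4   | 1
$\delta_B$                | 0.75  | 0.68  | 0.68  | 0.5   | 0.37  | 0.23  | 0
$\delta_{NS}$             | 0.25  | 0.32  | 0.32  | 0.5   | 0.63  | 0.77  | 1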

From the definition of the degree of non-specificity, we can
propose a degree of specificity such as:

$$\delta_B(m) = 1 - \sum_{X \in G^\Theta,\, X \neq \emptyset} m(X) \frac{\log_2(|X|)}{\log_2(|\Theta|)} = 1 - \sum_{X \in G^\Theta,\, X \neq \emptyset} m(X) \log_{|\Theta|}(|X|). \tag{26}$$

As already observed, the degree given by equation (26) does not
really measure the specificity but rather the Bayesianity of the
considered bba. This degree is equal to 1 for Bayesian bba's and
is smaller than 1 for other bba's.

Definition The Bayesianity in the theory of belief functions
quantifies how far a bba $m$ is from a probability.

We illustrate these degrees in the next subsection.


A. Illustration 


In order to illustrate and discuss the degrees introduced
above, we take the examples given in table I. The
bba's are defined on $2^\Theta$ where $\Theta = \{\theta_1, \theta_2, \theta_3\}$.
We only consider non-Bayesian bba's here, since the values for
Bayesian bba's were given above.

We can observe that, for a given sum of basic belief on the
singletons of $\Theta$, the Bayesianity degree can change according
to the basic belief on the other focal elements. For example,
the degrees are the same for $m_2$ and $m_3$, but different for $m_4$.
For the bba $m_4$, note that the sum of the basic beliefs on the
singletons is equal to the basic belief on the ignorance. In this
case the Bayesianity degree is exactly 0.5, which conforms to
the intuitive meaning of Bayesianity. If we look at $m_5$ and
$m_6$, we first note that there is no basic belief on the singletons.
As a consequence, the Bayesianity is weaker. Moreover, the
bba $m_5$ is naturally more Bayesian than $m_6$ because of the
basic belief that $m_6$ places on $\Theta$.

We must add that these degrees depend on the
cardinality of the frame of discernment for non-Bayesian bba's.
Indeed, if we consider the bba $m_1$ with $\Theta = \{\theta_1, \theta_2, \theta_3\}$,
the degree is $\delta_B = 0.75$. Now if we consider the same bba
with $\Theta = \{\theta_1, \theta_2, \theta_3, \theta_4\}$ (no belief is given on $\theta_4$), the
Bayesianity degree is $\delta_B = 0.80$. The larger the frame, the
larger the Bayesianity degree.
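The dependence on the frame size can be reproduced with a short computation. The following sketch implements equations (25) and (26) for bba's on a power set; the frame size is passed explicitly as `frame_size`, a parameter name chosen for this example.

```python
from math import log

def non_specificity_degree(m, frame_size):
    """Equation (25): sum of m(X) * log_{|Theta|}(|X|) over non-empty X."""
    return sum(v * log(len(X), frame_size) for X, v in m.items() if X)

def bayesianity_degree(m, frame_size):
    """Equation (26): 1 - non-specificity degree."""
    return 1.0 - non_specificity_degree(m, frame_size)

def s(*names):
    return frozenset(names)

# Columns m1, m4 and m7 of table I.
m1 = {s("t1"): 0.4, s("t2"): 0.1, s("t3"): 0.1,
      s("t1", "t2"): 0.3, s("t1", "t3"): 0.1}
m4 = {s("t1"): 0.3, s("t2"): 0.1, s("t3"): 0.1, s("t1", "t2", "t3"): 0.5}
m7 = {s("t1", "t2", "t3"): 1.0}

print(bayesianity_degree(m1, 3))  # frame {t1, t2, t3}
print(bayesianity_degree(m1, 4))  # same masses, frame enlarged with t4
```

The same masses give a Bayesianity degree close to 0.75 on a frame of three elements and 0.80 on a frame of four elements, so these degrees should only be compared between bba's defined on frames of the same size.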


V. DEGREE OF SPECIFICITY 


In the previous section, we showed the importance of
considering a degree instead of a measure. Moreover, the measures
of specificity and non-specificity given by equations (7)
and (6) give the same values for all Bayesian bba's, as seen
in the examples of section II.

We introduce here a degree of specificity based on comparison
with the most specific bba. This degree of specificity
is given by:

$$\delta_S(m) = 1 - d(m, m_s), \tag{27}$$

where $m_s$ is the most specific bba associated with $m$ and
$d$ is a distance taking values in $[0, 1]$. Here we also choose the
Jousselme distance, the most appropriate on the space of bba's,
with values in $[0, 1]$. Since this distance depends on the size
of the space $G^\Theta$, we have to compare degrees of specificity
for bba's defined on the same space. Accordingly, the main
problem is to define the most specific bba associated with $m$.


A. The most specific bba 


In the theory of belief functions, several partial orders
have been proposed in order to compare bba's [3]. These
partial orderings are generally based on comparisons of
plausibilities or commonalities, especially in order
to find the least-committed bba. However, comparing bba's
through plausibilities or commonalities can be complex and may
not yield a unique solution.

Finding the most specific bba associated with a bba
$m$ does not require a partial ordering. We limit the specific
bba's to the categorical bba's, $m_X(X) = 1$ where $X \in G^\Theta$,
and we use the following axiom and proposition:

Axiom For two categorical bba's $m_X$ and $m_Y$, $m_X$ is more
specific than $m_Y$ if and only if $|X| < |Y|$.

In case of equality, the two bba's are isospecific.

Proposition If we consider two isospecific bba's $m_X$ and
$m_Y$, the Jousselme distance between every bba $m$ and $m_X$ is
equal to the Jousselme distance between $m$ and $m_Y$:

$$\forall m, \quad d(m, m_X) = d(m, m_Y) \tag{28}$$

if and only if $m(X) = m(Y)$.

Proof The proof is obvious considering equations (21)
and (22). As both bba's $m_X$ and $m_Y$ are categorical, there is
only one non-null term in each of the vector differences $m - m_X$
and $m - m_Y$. These two terms $a_X$ and $a_Y$ are equal, because $m_X$
and $m_Y$ are isospecific and so, according to equation (22),
$D(Z, X) = D(Z, Y)$ $\forall Z \in G^\Theta$. Therefore $m(X) = m(Y)$,
which proves the proposition.

We define the most specific bba $m_s$ associated with a bba
$m$ as a categorical bba as follows: $m_s(X_{\max}) = 1$ where
$X_{\max} \in G^\Theta$.

Therefore, the question is now how to find $X_{\max}$. We propose
two approaches:

First approach, Bayesian case

If $m$ is a Bayesian bba, we define $X_{\max}$ such as:

$$X_{\max} = \arg\max(m(X),\ X \in \Theta). \tag{29}$$

If there exist several $X_{\max}$ (i.e. having the same
maximal mass and giving several isospecific bba's),
we can take any of them. Indeed, according to the
previous proposition, the degree of specificity of $m$
will be the same with $m_s$ given by any $X_{\max}$
satisfying (29).

First approach, non-Bayesian case

If $m$ is a non-Bayesian bba, we can define $X_{\max}$ in
a similar way such as:

$$X_{\max} = \arg\max\left(\frac{m(X)}{|X|},\ X \in G^\Theta,\ X \neq \emptyset\right). \tag{30}$$

In fact, this equation generalizes equation (29).
However, in this case we can also have several $X_{\max}$
that do not give isospecific bba's. Therefore, we choose
one of the most specific bba's, i.e. the one believing in the
element with the smallest cardinality. Note also that
we keep the terms of Yager from equation (7).

Second approach

Another way, in the case of a non-Bayesian bba $m$, is
to transform $m$ into a Bayesian bba by means of a
probability transformation such as the pignistic
probability. Afterwards, we can apply the previous
Bayesian case. With this approach, the most specific
bba associated with a bba $m$ is always a categorical
bba such that $m_s(X_{\max}) = 1$ where $X_{\max} \in \Theta$ and
not in $G^\Theta$.
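The two approaches can be sketched as follows for bba's on a power set. This is an illustrative reading of equations (29) and (30), not the authors' implementation: ties on $m(X)/|X|$ are broken toward the smaller cardinality as described above, any remaining tie is resolved arbitrarily, and the second approach is shown with the pignistic transformation only. The function names are hypothetical.

```python
def pignistic(m):
    """Pignistic probability of each singleton: m(A) shared equally inside A."""
    bet = {}
    for A, v in m.items():
        for theta in A:
            bet[theta] = bet.get(theta, 0.0) + v / len(A)
    return bet

def x_max_first(m):
    """Equation (30): maximize m(X) / |X|, preferring smaller |X| on ties."""
    candidates = [X for X in m if X]
    return max(candidates, key=lambda X: (m[X] / len(X), -len(X)))

def x_max_second(m):
    """Second approach: apply equation (29) to the pignistic probability."""
    bet = pignistic(m)
    return frozenset([max(bet, key=bet.get)])

def s(*names):
    return frozenset(names)

# Hypothetical bba: little mass on singletons, most of it on the whole frame.
m = {s("t1"): 0.12, s("t2"): 0.06, s("t3"): 0.06, s("t1", "t2", "t3"): 0.76}
print(sorted(x_max_first(m)))   # the whole frame: 0.76 / 3 exceeds 0.12
print(sorted(x_max_second(m)))  # a singleton, by construction
```

On a Bayesian bba both functions return the singleton of maximal mass, which is the sense in which equation (30) generalizes equation (29).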


B. Illustration 


In order to illustrate this degree of specificity we give some
examples. Table II gives the degree of specificity for
some Bayesian bba's. The smallest degree of specificity of
a Bayesian bba is obtained for the uniform distribution ($m_1$),
and the largest degree of specificity is of course obtained for
the categorical bba ($m_8$).

Table II
ILLUSTRATION OF THE DEGREE OF SPECIFICITY ON BAYESIAN BBA.

        | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\delta_S$
--------+------------+------------+------------+-----------
$m_1$   | 1/3        | 1/3        | 1/3        | 0.423
$m_2$   | 0.4        | 0.4        | 0.2        | 0.471
$m_3$   | 0.45       | 0.45       | 0.10       | 0.493
$m_4$   | 0.45       | 0.40       | 0.15       | 0.508
$m_5$   | 0.45       | 0.3        | 0.25       | 0.523
$m_6$   | 0.45       | 0.275      | 0.275      | 0.524
$m_7$   | 0.6        | 0.3        | 0.1        | 0.639
$m_8$   | 1          | 0          | 0          | 1

The degree of specificity increases when the differences
between the mass of the largest singleton and the masses
of the other singletons get bigger: $\delta_S(m_3) < \delta_S(m_4) <
\delta_S(m_5) < \delta_S(m_6)$. When there are three disjoint
singletons and the largest mass is 0.45 (on $\theta_1$),
the maximum degree of specificity is reached when the masses
of $\theta_2$ and $\theta_3$ are furthest from the mass of $\theta_1$ ($m_6$). If
two singletons have the same maximal mass, the bigger this mass
is, the bigger the degree of specificity: $\delta_S(m_2) < \delta_S(m_3)$.

In the case of non-Bayesian bba's, we first take a simple
example:

$$m_1(\theta_1) = 0.6, \quad m_1(\theta_1 \cup \theta_2) = 0.4, \tag{31}$$
$$m_2(\theta_1) = 0.5, \quad m_2(\theta_1 \cup \theta_2) = 0.5. \tag{32}$$

For these two bba's $m_1$ and $m_2$, both proposed approaches
give the same most specific bba: $m_{\theta_1}$. We obtain $\delta_S(m_1) =
0.7172$ and $\delta_S(m_2) = 0.6465$. Therefore, $m_1$ is more specific
than $m_2$. Note that these degrees are the same whether we
consider the bba's defined on $2^{\Theta_2}$ or on $2^{\Theta_3}$, with
$\Theta_2 = \{\theta_1, \theta_2\}$ and $\Theta_3 = \{\theta_1, \theta_2, \theta_3\}$. If we now
consider the Bayesian bba $m_3(\theta_1) = m_3(\theta_2) = 0.5$, the
associated degree of specificity is $\delta_S(m_3) = 0.5$. As expected
by intuition, $m_2$ is more specific than $m_3$.
We now consider the following bba:

$$m_4(\theta_1) = 0.6, \quad m_4(\theta_1 \cup \theta_2 \cup \theta_3) = 0.4. \tag{33}$$

The most specific bba is also $m_{\theta_1}$, and we have $\delta_S(m_4) =
0.6734$. This degree of specificity is naturally smaller than
$\delta_S(m_1)$ because of the mass 0.4 on a more imprecise focal
element.

Let’s now consider the following example: 


We do not obtain the same most specific bba with the two
proposed approaches: the first one gives a categorical
bba on a union of two singletons, and the second one gives
$m_{\theta_1}$. Hence, the first degree
of specificity is $\delta_S(m_5) = 0.755$ and the second one is
$\delta_S(m_5) = 0.111$. We note that the second approach naturally
produces smaller degrees of specificity.
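The degrees of specificity quoted for table II and for equations (31) to (33) can be recomputed from equation (27) with the Jousselme distance. The sketch below is illustrative: the most specific element is passed explicitly as `x_max` (here always $\theta_1$, as stated in the text) rather than searched for, and the function names are chosen for this example.

```python
from math import sqrt

def jaccard(A, B):
    """Equation (22)."""
    if not A and not B:
        return 1.0
    return len(A & B) / len(A | B)

def jousselme(m1, m2):
    """Equation (21), evaluated over the focal elements of both bba's."""
    focal = set(m1) | set(m2)
    diff = {X: m1.get(X, 0.0) - m2.get(X, 0.0) for X in focal}
    quad = sum(diff[A] * jaccard(A, B) * diff[B] for A in focal for B in focal)
    return sqrt(max(quad, 0.0) / 2.0)

def specificity_degree(m, x_max):
    """Equation (27): 1 - d(m, m_s), with m_s categorical on x_max."""
    return 1.0 - jousselme(m, {x_max: 1.0})

def s(*names):
    return frozenset(names)

t1 = s("t1")
uniform = {s("t1"): 1 / 3, s("t2"): 1 / 3, s("t3"): 1 / 3}       # table II, m1
bayes7 = {s("t1"): 0.6, s("t2"): 0.3, s("t3"): 0.1}              # table II, m7
nb1 = {s("t1"): 0.6, s("t1", "t2"): 0.4}                         # equation (31)
nb2 = {s("t1"): 0.5, s("t1", "t2"): 0.5}                         # equation (32)
nb4 = {s("t1"): 0.6, s("t1", "t2", "t3"): 0.4}                   # equation (33)

for name, m in [("uniform", uniform), ("bayes7", bayes7),
                ("nb1", nb1), ("nb2", nb2), ("nb4", nb4)]:
    print(name, round(specificity_degree(m, t1), 4))
```

Moving the mass 0.4 from $\theta_1 \cup \theta_2$ to the whole frame lowers the degree, because the Jaccard weight between $\theta_1$ and the larger set is smaller.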


C. Application to measure the specificity of a combination rule 


We propose in this section to use the proposed degree of
specificity in order to measure the quality of the result of
a combination rule in the theory of belief functions. Indeed,
many combination rules have been developed to merge
bba's [10], [19]. The choice of one of them is not always
obvious. For a particular application, we can compare the
results produced by several rules, or try to choose according to
the particular properties wanted for the application. We also
proposed to study the behavior of the rules on generated bba's
[12]. However, no real measures have been used to evaluate
the combination rules. Hereafter, we only show how the degree
of specificity can be used to evaluate and compare the
combination rules in the theory of belief functions. A complete
study could then be done, for example, on generated bba's.

We recall here the combination rules used; see [10] for their
description.

The Dempster rule is the normalized conjunctive combination
rule of equation (16), given for two basic belief
assignments $m_1$ and $m_2$ and for all $X \in G^\Theta$, $X \neq \emptyset$ by:

$$m_{DS}(X) = \frac{1}{1 - k} \sum_{A \cap B = X} m_1(A) m_2(B), \tag{35}$$

where $k$ is either $m_c(\emptyset)$ or the sum of the masses of the
elements of the $\emptyset$ equivalence class for $D^\Theta$.

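The Yager rule transfers the global conflict to the total
ignorance $\Theta$:

$$m_Y(X) = \begin{cases} m_c(X) & \text{if } X \in 2^\Theta \setminus \{\emptyset, \Theta\}, \\ m_c(\Theta) + m_c(\emptyset) & \text{if } X = \Theta, \\ 0 & \text{if } X = \emptyset. \end{cases} \tag{36}$$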


The disjunctive combination rule is given for two basic
belief assignments $m_1$ and $m_2$ and for all $X \in G^\Theta$ by:

$$m_{Dis}(X) = \sum_{A \cup B = X} m_1(A) m_2(B). \tag{37}$$


The Dubois and Prade rule is given for two basic belief
assignments $m_1$ and $m_2$ and for all $X \in G^\Theta$, $X \neq \emptyset$ by:

$$m_{DP}(X) = \sum_{A \cap B = X} m_1(A) m_2(B) + \sum_{A \cup B = X,\ A \cap B = \emptyset} m_1(A) m_2(B). \tag{38}$$


The PCR rule is given for two basic belief assignments $m_1$
and $m_2$ and for all $X \in G^\Theta$, $X \neq \emptyset$ by:

$$m_{PCR}(X) = m_c(X) + \sum_{Y \in G^\Theta,\ X \cap Y = \emptyset} \left( \frac{m_1(X)^2 m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2 m_1(Y)}{m_2(X) + m_1(Y)} \right). \tag{39}$$

The principle is very simple: compute the degree of specificity
of the bba's to be combined, then calculate the degree
of specificity of the bba obtained with the chosen combination
rule. The latter can be compared with the degrees
of specificity of the combined bba's.

In the following example, given in table III, we combine
two Bayesian bba's with the discernment frame $\Theta =
\{\theta_1, \theta_2, \theta_3\}$. The two bba's are very contradictory. The
values are rounded. The first approach gives the same degree of
specificity as the second one, except for the rules $m_{Dis}$, $m_{DP}$
and $m_Y$. We observe that the smallest degree of specificity is
obtained for the conjunctive rule, because of the accumulated
mass on the empty set, which is not considered in the calculation
of the degree. The highest degree of specificity is reached for
the Yager rule, for the same reason. It is the only rule giving a
degree of specificity greater than $\delta_S(m_1)$ and $\delta_S(m_2)$. The
second approach clearly shows the loss of specificity with the
rules $m_{Dis}$, $m_Y$ and $m_{DP}$, which have a cautious behavior.
In this example, the degrees of specificity obtained by the
combination rules are almost all lower than $\delta_S(m_1)$ and
$\delta_S(m_2)$, because the bba's are highly conflicting. If the
degrees of specificity of rules such as $m_{DS}$ and $m_{PCR}$ are
greater than $\delta_S(m_1)$ and $\delta_S(m_2)$, that means that the bba's
are not in conflict.

Table III
DEGREES OF SPECIFICITY FOR COMBINATION RULES ON BAYESIAN BBA'S.

                          | $m_1$          | $m_2$          | $m_c$          | $m_{DS}$       | $m_Y$          | $m_{Dis}$                   | $m_{DP}$                    | $m_{PCR}$
--------------------------+----------------+----------------+----------------+----------------+----------------+-----------------------------+-----------------------------+---------------
$\emptyset$               | 0              | 0              | 0.76           | 0              | 0              | 0                           | 0                           | 0
$\theta_1$                | 0.6            | 0.2            | 0.12           | 0.50           | 0.12           | 0.12                        | 0.12                        | 0.43
$\theta_2$                | 0.1            | 0.6            | 0.06           | 0.25           | 0.06           | 0.06                        | 0.06                        | 0.37
$\theta_3$                | 0.3            | 0.2            | 0.06           | 0.25           | 0.06           | 0.06                        | 0.06                        | 0.20
$\theta_1 \cup \theta_2$  | 0              | 0              | 0              | 0              | 0              | 0.38                        | 0.38                        | 0
$\theta_1 \cup \theta_3$  | 0              | 0              | 0              | 0              | 0              | 0.18                        | 0.18                        | 0
$\theta_2 \cup \theta_3$  | 0              | 0              | 0              | 0              | 0              | 0.20                        | 0.20                        | 0
$\Theta$                  | 0              | 0              | 0              | 0              | 0.76           | 0                           | 0                           | 0
$m_s$ (1st approach)      | $m_{\theta_1}$ | $m_{\theta_2}$ | $m_{\theta_1}$ | $m_{\theta_1}$ | $m_\Theta$     | $m_{\theta_1 \cup \theta_2}$ | $m_{\theta_1 \cup \theta_2}$ | $m_{\theta_1}$
$m_s$ (2nd approach)      | $m_{\theta_1}$ | $m_{\theta_2}$ | $m_{\theta_1}$ | $m_{\theta_1}$ | $m_{\theta_1}$ | $m_{\theta_1}$              | $m_{\theta_1}$              | $m_{\theta_1}$
$\delta_S$ (1st approach) | 0.639          | 0.655          | 0.176          | 0.567          | 0.857          | 0.619                       | 0.619                       | 0.497
$\delta_S$ (2nd approach) | 0.639          | 0.655          | 0.176          | 0.567          | 0.457          | 0.478                       | 0.478                       | 0.497
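The mass columns of the Bayesian example (table III) can be recomputed with a compact implementation of the rules recalled above. The sketch below covers the conjunctive, Dempster, Yager, disjunctive and two-source PCR rules on a power set; it is an illustration under these assumptions, not the authors' code, and the function names are chosen for this example. The Dempster function is undefined when the conflict $k$ equals 1.

```python
EMPTY = frozenset()

def combine(m1, m2, op):
    """Sum m1(A) * m2(B) on op(A, B) for every pair of focal elements."""
    out = {}
    for A, a in m1.items():
        for B, b in m2.items():
            X = op(A, B)
            out[X] = out.get(X, 0.0) + a * b
    return out

def conjunctive(m1, m2):
    return combine(m1, m2, frozenset.intersection)   # equation (16)

def disjunctive(m1, m2):
    return combine(m1, m2, frozenset.union)          # equation (37)

def dempster(m1, m2):
    """Equation (35): conjunctive rule normalized by 1 - k (requires k < 1)."""
    mc = conjunctive(m1, m2)
    k = mc.pop(EMPTY, 0.0)
    return {X: v / (1.0 - k) for X, v in mc.items()}

def yager(m1, m2, frame):
    """Equation (36): the conflicting mass is transferred to the frame."""
    mc = conjunctive(m1, m2)
    k = mc.pop(EMPTY, 0.0)
    mc[frame] = mc.get(frame, 0.0) + k
    return mc

def pcr(m1, m2):
    """Equation (39): each partial conflict m1(X) * m2(Y) is shared
    between X and Y proportionally to m1(X) and m2(Y)."""
    out = conjunctive(m1, m2)
    out.pop(EMPTY, None)
    for X, a in m1.items():
        for Y, b in m2.items():
            if X and Y and not (X & Y) and a + b > 0:
                out[X] = out.get(X, 0.0) + a * a * b / (a + b)
                out[Y] = out.get(Y, 0.0) + b * b * a / (a + b)
    return out

t1, t2, t3 = frozenset(["t1"]), frozenset(["t2"]), frozenset(["t3"])
frame = t1 | t2 | t3
m1 = {t1: 0.6, t2: 0.1, t3: 0.3}
m2 = {t1: 0.2, t2: 0.6, t3: 0.2}

print(round(conjunctive(m1, m2)[EMPTY], 2))
print({tuple(sorted(X)): round(v, 2) for X, v in pcr(m1, m2).items()})
```

The resulting bba's can then be passed to any implementation of the degree of specificity of equation (27) to obtain the last rows of the table.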

Let's now consider a simple non-Bayesian example in
table IV.


Figure 4. Two non-Bayesian bba's


Table IV
DEGREES OF SPECIFICITY FOR COMBINATION RULES ON NON-BAYESIAN
BBA'S.

                          | $m_1$          | $m_2$          | $m_c$          | $m_{DS}$       | $m_Y$          | $m_{Dis}$                   | $m_{DP}$       | $m_{PCR}$
--------------------------+----------------+----------------+----------------+----------------+----------------+-----------------------------+----------------+---------------
$\emptyset$               | 0              | 0              | 0.47           | 0              | 0              | 0                           | 0              | 0
$\theta_1$                | 0.4            | 0.2            | 0.2            | 0.377          | 0.2            | 0.08                        | 0.2            | 0.39
$\theta_2$                | 0.1            | 0.3            | 0.17           | 0.321          | 0.17           | 0.03                        | 0.17           | 0.28
$\theta_3$                | 0.3            | 0.1            | 0.12           | 0.226          | 0.12           | 0.03                        | 0.12           | 0.24
$\theta_1 \cup \theta_2$  | 0.2            | 0.1            | 0.04           | 0.076          | 0.04           | 0.31                        | 0.18           | 0.06
$\theta_1 \cup \theta_3$  | 0              | 0              | 0              | 0              | 0              | 0.1                         | 0.1            | 0
$\theta_2 \cup \theta_3$  | 0              | 0.2            | 0              | 0              | 0              | 0.18                        | 0.1            | 0.03
$\Theta$                  | 0              | 0.1            | 0              | 0              | 0.47           | 0.27                        | 0.13           | 0
$m_s$ (1st approach)      | $m_{\theta_1}$ | $m_{\theta_2}$ | $m_{\theta_1}$ | $m_{\theta_1}$ | $m_{\theta_1}$ | $m_{\theta_1 \cup \theta_2}$ | $m_{\theta_1}$ | $m_{\theta_1}$
$m_s$ (2nd approach)      | $m_{\theta_1}$ | $m_{\theta_2}$ | $m_{\theta_1}$ | $m_{\theta_1}$ | $m_{\theta_1}$ | $m_{\theta_1}$              | $m_{\theta_1}$ | $m_{\theta_1}$
$\delta_S$ (1st approach) | 0.553          | 0.522          | 0.336          | 0.488          | 0.389          | 0.609                       | 0.428          | 0.497
$\delta_S$ (2nd approach) | 0.553          | 0.522          | 0.336          | 0.488          | 0.389          | 0.456                       | 0.428          | 0.497


VI. CONCLUSION


First, we propose in this article a reflection on the measures
of uncertainty in the theory of belief functions. Many
measures have been proposed to quantify different kinds
of uncertainty, such as the specificity (closely linked to
imprecision) and the discord. The discord, which must not
be confused with the conflict, is for us the contradiction of a
source (giving information with a bba in the theory of belief
functions) with itself. We distinguish the contradiction from
the conflict, which is the contradiction between 2 or more bba's.
We introduce a measure of contradiction for a bba based on
the weighted average of the conflict between the bba and the
categorical bba's of the focal elements.

The previously proposed specificity and non-specificity
measures are not defined on the same space, so it is
difficult to compare them. That is the reason why we propose
the use of degrees of uncertainty. Moreover, these measures give
some counter-intuitive results on Bayesian bba's. We propose
a degree of specificity based on the distance between a mass
and its most specific associated mass, which we introduce. This
most specific associated mass can be obtained in two ways and
gives the nearest categorical bba for a given bba. We also
propose to use the degree of specificity in order to measure the
specificity of a fusion rule. That is a tool to compare and
evaluate the various combination rules given in the theory of
belief functions.
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Abstract 


In this paper, we present an extension of the multi-criteria decision making based on the Analytic Hierarchy Process (AHP) 
which incorporates uncertain knowledge matrices for generating basic belief assignments (bba’s). The combination of priority 
vectors corresponding to bba’s related to each (sub)-criterion is performed using the Proportional Conflict Redistribution rule no. 
5 proposed in Dezert-Smarandache Theory (DSmT) of plausible and paradoxical reasoning. The method presented here, called
DSmT-AHP, is illustrated on very simple examples.


Keywords: Analytic Hierarchy Process (AHP), Dezert-Smarandache Theory (DSmT), Evidential Reasoning, Information Fusion, 
Decision Making, Multi-Criteria Analysis (MCA). 


I. INTRODUCTION 


In many real-life contexts, decisions are based on imperfect information provided by several more or less reliable and conflicting
sources. Several recent developments have tried to introduce belief function theory [24] into the AHP framework. A first
attempt was made to consider imprecise evaluations of alternatives using the Dempster-Shafer theory and the Dempster
rule [2]. However, classical fusion rules such as the Dempster rule are known to take conflict poorly into account. Another new
framework called ER-MCDA has been proposed to mix a multi-criteria decision making method called Analytic Hierarchy
Process (AHP) with uncertainty theories including fuzzy sets, possibility and belief function theories [28]. The principle of
the ER-MCDA methodology is to use AHP to analyze the decision problem and to replace the aggregation step by two
successive fusion processes [29]. Its main objective is to take into account information imperfection, source reliability
and conflict. When doing this, an important problem occurs, since the importance of criteria is a different concept from the
classical reliability concept developed and used in the belief theory context. This article presents a new development related
to the use of fusion in the context of multi-criteria decision analysis, focusing on the special case of AHP. First, we present
a method called DSmT-AHP, which is an adapted version of AHP that allows imprecise and uncertain evaluations of
alternatives to be considered. Secondly, we describe a new fusion rule that has been developed on the basis of the PCR rule proposed in the
context of DSmT to implement this method. This new rule is also used in the context of the ER-MCDA method.


II. DSMT-AHP APPROACH 


We briefly present the extension of Saaty's AHP method [17] using an aggregation method developed in the Dezert-Smarandache
Theory (DSmT) framework [25] of evidential reasoning (ER), able to distinguish between the importance of criteria, the
uncertainty related to the evaluations of criteria and the reliability of the different sources. This method was introduced for
the first time in [8] and applied to risk prevention of natural hazards in mountains in [29]. Before explaining the principle of
DSmT-AHP, it is necessary to recall some basics of DSmT to make the paper self-contained for readers not familiar with
this theoretical framework. All the basics, with many detailed examples, can be obtained freely by downloading the three volumes
given in [25] from the web.


A. DSmT basics 


Let $\Theta = \{\theta_1, \theta_2, \cdots, \theta_n\}$ be a finite set of $n$ elements assumed to be exhaustive. $\Theta$ corresponds to the frame of
discernment of the problem under consideration. In general, we assume that elements of $\Theta$ are non-exclusive in order to deal
with vague/fuzzy and relative concepts [25], Vol. 2. This is the so-called free-DSm model. In DSmT, there is no need to
work on a refined frame consisting of a discrete finite set of exclusive and exhaustive hypotheses (referred to as Shafer's model),
because DSm rules of combination work for any model of the frame. The hyper-power set $D^\Theta$ is defined as the set of all
propositions built from elements of $\Theta$ with $\cup$ and $\cap$; see [25], Vol. 1 for examples. A (quantitative) basic belief assignment
(bba) expressing the belief committed to the elements of $D^\Theta$ by a given source is a mapping $m(\cdot): D^\Theta \to [0, 1]$ such that
$m(\emptyset) = 0$ and $\sum_{A \in D^\Theta} m(A) = 1$. Elements $A \in D^\Theta$ having $m(A) > 0$ are called focal elements of $m(\cdot)$. The credibility
and plausibility functions are defined in almost the same manner as in Dempster-Shafer Theory (DST) [24], except that $2^\Theta$ is
replaced by $D^\Theta$ in their definitions. In DSmT, the Proportional Conflict Redistribution rule no. 5 (PCR5) is generally used to
combine bba's. PCR5 transfers the conflicting mass only to the elements involved in the conflict, proportionally to their
individual masses, so that the specificity of the information is entirely preserved in this fusion process. For simplicity, we work
in the power set $2^\Theta$, since most readers are already familiar with this fusion space. Let $m_1(\cdot)$ and $m_2(\cdot)$ be two
independent bba's; then the PCR5 rule is defined as follows: $m_{PCR5}(\emptyset) = 0$ and $\forall X \in 2^\Theta \setminus \{\emptyset\}$

$$m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2) + \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2 m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2 m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \tag{1}$$

where all denominators in (1) are different from zero. If a denominator is zero, that fraction is discarded. All propositions/sets
are in a canonical form. A variant of (1), called PCR6, for combining $s > 2$ sources and for working in other fusion spaces
(hyper-power sets or super-power sets) is presented in [25]. Additional properties of PCR5 can be found in [7]. Extensions of
PCR5 for combining qualitative bba's can be found in [25], Vol. 2 & 3.
As a simple example, consider two bba's $m_1(\cdot)$ and $m_2(\cdot)$, with $A \cap B = \emptyset$ for the model of $\Theta$, $m_1(A) = 0.6$ and
$m_2(B) = 0.3$. With PCR5, the partial conflicting mass $m_1(A) m_2(B) = 0.6 \cdot 0.3 = 0.18$ is redistributed to $A$ and $B$ only, with
the following respective proportions: $x_A = 0.12$ and $x_B = 0.06$, because

$$\frac{x_A}{m_1(A)} = \frac{x_B}{m_2(B)} = \frac{m_1(A) m_2(B)}{m_1(A) + m_2(B)} = \frac{0.18}{0.9} = 0.2.$$
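The proportional redistribution in this example amounts to a two-line computation. The sketch below is a minimal illustration of the redistribution of a single partial conflict, not a full PCR5 implementation; the function name is chosen for this example.

```python
def redistribute_partial_conflict(mass_a, mass_b):
    """Split the partial conflict mass_a * mass_b between A and B
    proportionally to m1(A) = mass_a and m2(B) = mass_b."""
    conflict = mass_a * mass_b
    ratio = conflict / (mass_a + mass_b)   # common ratio x_A/m1(A) = x_B/m2(B)
    return mass_a * ratio, mass_b * ratio

x_a, x_b = redistribute_partial_conflict(0.6, 0.3)
print(x_a, x_b)
```

The two shares always add up to the partial conflicting mass, so no mass is lost or sent to elements that are not involved in the conflict.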





B. DSmT-AHP approach 


DSmT-AHP serves a similar purpose to AHP [16], [17], SMART [30] or DS/AHP [2], [4], etc., that is, to find the
preference rankings of the decision alternatives (DA), or groups of DA. The DSmT-AHP approach consists of three steps:


• Step 1: We extend the construction of the matrix to take into account the partial uncertainty (disjunctions) between
possible alternatives. If no comparison is available between elements, then the corresponding element in the matrix is
zero. Each bba related to each (sub-)criterion is the normalized eigenvector associated with the largest eigenvalue of the
"uncertain" knowledge matrix (as done in the standard AHP approach).

• Step 2: We use the DSmT fusion rules, typically the PCR5 rule, to combine the bba's drawn from step 1 to get a final Multi-
Criteria Decision-Making (MCDM) priority ranking. This fusion step must take into account the different importances (if
any) of the criteria, as will be explained in the sequel.

• Step 3: Decision-making can be based either on the maximum of belief (also called credibility), or on the maximum
of the plausibility of the decision alternatives (DA), as well as on the maximum of the approximate subjective probability of
DA obtained by different probabilistic transformations such as the Pignistic, DSmP, or Sudano's transformations [25], Vol. 2.


The MCDM problem deals with several criteria having different importances, and the classical fusion rules cannot be applied
directly as in step 2. In AHP, the fusion is done from the product of the bba's matrix with the weighting vector of criteria.
Such AHP fusion is nothing but a simple componentwise weighted average of bba's, and it does not process
the conflicting information between the sources efficiently. It also does not preserve the neutrality of a fully ignorant source in the fusion. To
mitigate these problems, we recall the new solution for combining sources of different importances proposed in [26]. The
reliability is an objective property of a source, whereas the importance of a source is a subjective characteristic expressed by
the fusion system designer. The reliability of a source represents its ability to provide the correct assessment/solution of the
given problem. It is characterized by a discounting reliability factor, usually denoted $\alpha$ in $[0, 1]$, which should be estimated
from statistics when available, or by other techniques [12]. The reliability can be context-dependent. By convention, we usually
take $\alpha = 1$ when the source is fully reliable and $\alpha = 0$ if the source is totally unreliable. The reliability of a source is usually
taken into account with Shafer's discounting method [24] defined by:

$$\begin{cases} m_\alpha(X) = \alpha \cdot m(X), & \text{for } X \neq \Theta \\ m_\alpha(\Theta) = \alpha \cdot m(\Theta) + (1 - \alpha) \end{cases} \tag{2}$$


The importance of a source is not the same as its reliability. We characterize it by an importance factor β in [0,1], which
represents the weight of importance granted to the source by the fusion system designer. The choice of β is usually
not related to the reliability of the source, and β can be set to any value in [0,1] by the designer for his/her own reasons. By
convention, the fusion system designer takes β = 1 to grant the maximal importance to the source in the
fusion process, and β = 0 if no importance at all is granted to this source. The fusion designer
must be able to deal with importance factors in a different way than with reliability factors, since they correspond to distinct
properties associated with a source of information. The importance of a source is particularly crucial in hierarchical
multi-criteria decision making problems, especially in the AHP [17], [21]. That’s why it is essential to show how the importance




can be efficiently managed in evidential reasoning approaches. To take into account the importance of the sources, we use the
dual of Shafer’s discounting approach for reliabilities of sources, as originally introduced briefly by Tacnet in [25], Vol. 3, Chap.
23, p. 613. It consists in defining the importance discounting with respect to the empty set rather than the total ignorance Θ (as
done with Shafer’s discounting). Such new discounting deals easily with sources of different importances and is very simple
to use. Mathematically, we define the importance discounting of a source m(.) having the importance factor β in [0,1] by:


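$$\begin{cases} m_\beta(X) = \beta \cdot m(X), & \text{for } X \neq \emptyset \\ m_\beta(\emptyset) = \beta \cdot m(\emptyset) + (1 - \beta) \end{cases} \tag{3}$$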
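
The importance discounting just defined can be sketched in the same way. This is a minimal sketch under our own assumptions: bba’s are stored as dictionaries from frozensets to masses, the empty set is `frozenset()`, and the masses are hypothetical.

```python
EMPTY = frozenset()               # the empty set


def importance_discount(m, beta):
    """Importance discounting of a bba given as {frozenset: mass}.

    Every mass is scaled by beta and the removed mass 1 - beta
    is transferred to the empty set (not to the total ignorance).
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    discounted = {X: beta * mass for X, mass in m.items() if X != EMPTY}
    discounted[EMPTY] = beta * m.get(EMPTY, 0.0) + (1.0 - beta)
    return discounted


m = {frozenset("A"): 0.6,         # hypothetical masses, for illustration
     frozenset("BC"): 0.3,
     frozenset("ABC"): 0.1}
m_beta = importance_discount(m, beta=0.25)
```

With β = 0.25 the three focal elements keep the same relative proportions (0.15, 0.075, 0.025), while the empty set receives the mass 0.75. With β = 1 the original bba is returned unchanged, apart from an explicit zero mass on the empty set.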


Here we allow non-normal bba’s, since m_β(∅) ≥ 0, as suggested by Smets in [27]. This new discounting preserves
the specificity of the primary information since all focal elements are discounted with the same importance factor. Here we use
the positive mass of the empty set as an intermediate/preliminary step of the fusion process. Clearly, when β = 1 is chosen
by the fusion designer, the source takes its full importance in the fusion process and so the original bba
m(.) is kept unchanged. If the fusion designer takes β = 0, one gets m_β(∅) = 1, which is interpreted as a fully
non-important source. m(∅) > 0 is not interpreted as the mass committed to some conflicting information (classical interpretation),
nor as the mass committed to unknown elements when working with the open-world assumption (Smets’ interpretation), but
only as the mass of the discounted importance of a source in this particular context. Based on this discounting, one adapts
the PCR5 (or PCR6) rule for N ≥ 2 discounted bba’s m_{β,i}(.), i = 1, 2, ..., N by considering the following extension, denoted
PCR5∅, written here for two discounted bba’s m_1(.) and m_2(.) and defined by: ∀X ∈ 2^Θ,

$$m_{PCR5_\emptyset}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) + \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \tag{4}$$


A similar extension can be done for the PCR5 and PCR6 formulas for N > 2 sources given in [25], Vol. 2. A detailed presentation
of this technique with several examples has been done in [26] and is not reported here due to space limitations.
The difference between eqs. (1) and (4) is that m_PCR5(∅) = 0 whereas m_PCR5∅(∅) ≥ 0. Since we usually work
with normal bba’s for decision-making support, the combined bba will be normalized. In the AHP context, the importance
factors correspond to the components of the normalized eigenvector w. It is important to note that such importance discounting
method cannot be used in DST with Dempster-Shafer’s rule of combination, because this rule does not respond to the
discounting of sources towards the empty set (see Theorem 1 in [26] for proof).
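
The insensitivity of Dempster-Shafer’s rule to importance discounting can be checked numerically. The sketch below is only an illustration and not the proof of [26]: the dictionary representation of bba’s, the helper names and the two bba’s are our own assumptions, and the importance factors are assumed strictly positive.

```python
from itertools import product

EMPTY = frozenset()


def importance_discount(m, beta):
    """Scale all masses by beta and move 1 - beta to the empty set."""
    out = {X: beta * mass for X, mass in m.items() if X != EMPTY}
    out[EMPTY] = beta * m.get(EMPTY, 0.0) + (1.0 - beta)
    return out


def dempster(m1, m2):
    """Dempster-Shafer rule: conjunctive consensus restricted to the
    non-empty intersections, followed by renormalization."""
    joint = {}
    for (x1, v1), (x2, v2) in product(m1.items(), m2.items()):
        x = x1 & x2
        if x:                                  # drop every empty intersection
            joint[x] = joint.get(x, 0.0) + v1 * v2
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("total conflict: the rule is undefined")
    return {x: v / total for x, v in joint.items()}


# Two hypothetical bba's on the frame {A, B, C}.
m1 = {frozenset("A"): 0.6, frozenset("BC"): 0.3, frozenset("ABC"): 0.1}
m2 = {frozenset("A"): 0.2, frozenset("B"): 0.5, frozenset("ABC"): 0.3}

plain = dempster(m1, m2)
discounted = dempster(importance_discount(m1, 0.25),
                      importance_discount(m2, 0.75))
largest_gap = max(abs(plain[x] - discounted[x]) for x in plain)
```

Every non-empty product is scaled by the same factor β_1·β_2, which the renormalization cancels, so `largest_gap` is zero up to rounding: the importance factors leave no trace in the result.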


We have shown how the reliability and importance of sources can be taken into account separately in the fusion process.
Taking them into account jointly is more difficult because, in general, the reliability and importance discounting
approaches do not commute, except when α_i = β_i = 1. Indeed, it can be easily verified that m_{α_i,β_i}(.) ≠ m_{β_i,α_i}(.) whenever
α_i ≠ 1 and β_i ≠ 1. Here m_{α_i,β_i}(.) denotes the reliability discounting of m_i(.) by α_i followed by the importance discounting of
m_{α_i}(.) by β_i, which explains the notation α_i, β_i used in the index. Similarly, m_{β_i,α_i}(.) denotes the importance discounting of
m_i(.) by β_i followed by the reliability discounting of m_{β_i}(.) by α_i. In order to deal with both reliability and importance
factors, and because of the non-commutativity of these discountings, we have proposed two methods in [26] to carry out the
fusion of the sources in a three-step process as follows:


• Method 1: Step 1: Apply reliability and then importance discountings to get m_{α_i,β_i}(.), i = 1, ..., s, combine them
with PCR5∅ or PCR6∅, and normalize the resulting bba; Step 2: Apply importance and then reliability discountings to get
m_{β_i,α_i}(.), i = 1, ..., s, combine them with PCR5∅ or PCR6∅, and normalize the resulting bba; Step 3 (mixing/averaging):
Combine the resulting bba’s of Steps 1 and 2 using the arithmetic mean operator¹.


• Method 2: Another, simpler method, which could be useful for intermediate traceability in some applications, consists
in replacing Steps 1 & 2 by Step 1’: Apply reliability discounting only to get m_{α_i}(.) and combine them with PCR5 or PCR6;
Step 2’: Apply importance discounting only to get m_{β_i}(.), combine them with PCR5∅ or PCR6∅, and normalize the
result; Step 3’: same as Step 3. Due to space limitation, only Method 1 is briefly illustrated in the following simple example.
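
The non-commutativity that motivates these two methods is easy to observe on a small case. The following sketch is an illustration under our own assumptions (dictionary representation of bba’s, hypothetical masses, α = 0.8 and β = 0.5); it only compares the two orders of discounting and does not implement the fusion steps.

```python
EMPTY = frozenset()
THETA = frozenset("ABC")


def reliability_discount(m, alpha):
    """Shafer's discounting: 1 - alpha goes to the total ignorance."""
    out = {X: alpha * mass for X, mass in m.items() if X != THETA}
    out[THETA] = alpha * m.get(THETA, 0.0) + (1.0 - alpha)
    return out


def importance_discount(m, beta):
    """Importance discounting: 1 - beta goes to the empty set."""
    out = {X: beta * mass for X, mass in m.items() if X != EMPTY}
    out[EMPTY] = beta * m.get(EMPTY, 0.0) + (1.0 - beta)
    return out


m = {frozenset("A"): 0.6, frozenset("BC"): 0.3, THETA: 0.1}  # hypothetical
alpha, beta = 0.8, 0.5

m_ab = importance_discount(reliability_discount(m, alpha), beta)  # alpha, then beta
m_ba = reliability_discount(importance_discount(m, beta), alpha)  # beta, then alpha
```

Both orders give the same masses to A (0.24) and B∪C (0.12), but they disagree on the empty set (0.5 against 0.4) and on Θ (0.14 against 0.24), which is why Method 1 processes both orders and averages the two fused results.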


¹ Other combination rules could also be used, like PCR5 or PCR6, but we don’t see a solid justification to use them again, and they require more
computations than the simple arithmetic mean.

C. A simple example


Let’s consider a set of three cars Θ = {A, B, C} with Shafer’s model, and only two criteria C1 = Fuel Economy and C2 = Reliability.
Let’s assume that with respect to each criterion the following “uncertain” preference matrices are given:


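[M(C1): 3 × 3 preference matrix with rows and columns A, B∪C, Θ; entries not recoverable.]

$$M(C2) = \begin{array}{c|cccc} & A & B & A \cup C & B \cup C \\ \hline A & 1 & 2 & 4 & 3 \\ B & 1/2 & 1 & 1/2 & 1/5 \\ A \cup C & 1/4 & 2 & 1 & 0 \\ B \cup C & 1/3 & 5 & 0 & 1 \end{array}$$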


• Step 1 (bba’s generation): Applying the AHP method, one gets the priority vectors w(C1) ≈ [0.0889 0.5337 0.3774]'
and w(C2) ≈ [0.5002 0.1208 0.1222 0.2568]', which are identified with the bba’s m_C1(.) and m_C2(.) as follows:
m_C1(A) = 0.0889, m_C1(B∪C) = 0.5337, m_C1(A∪B∪C) = 0.3774 and m_C2(A) = 0.5002, m_C2(B) = 0.1208,
m_C2(A∪C) = 0.1222 and m_C2(B∪C) = 0.2568.
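
The bba generation of Step 1 can be sketched for the criterion C2 as follows. This is only an illustration: the power iteration is one possible way to obtain the normalized principal eigenvector (the method used by the authors is not stated), and the first row of M(C2) is assumed here to follow from the reciprocity a_ij = 1/a_ji of AHP matrices, the other rows being B: [1/2 1 1/2 1/5], A∪C: [1/4 2 1 0] and B∪C: [1/3 5 0 1].

```python
def principal_eigenvector(matrix, iterations=200):
    """Normalized principal eigenvector of a non-negative matrix,
    obtained by power iteration (entries rescaled to sum to one)."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(a * x for a, x in zip(row, w)) for row in matrix]
        total = sum(w)
        w = [x / total for x in w]
    return w


# Rows and columns ordered as A, B, A∪C, B∪C.
M_C2 = [
    [1.0, 2.0, 4.0, 3.0],          # A   (assumed: reciprocals of column A)
    [1 / 2, 1.0, 1 / 2, 1 / 5],    # B
    [1 / 4, 2.0, 1.0, 0.0],        # A∪C (0 = no comparison available)
    [1 / 3, 5.0, 0.0, 1.0],        # B∪C
]
w_C2 = principal_eigenvector(M_C2)
```

Under these assumptions the components of `w_C2` agree with the priority vector w(C2) ≈ [0.5002 0.1208 0.1222 0.2568]' to about three decimal places.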


• Step 2 (Fusion): When the two criteria have the same full importance in the hierarchy, they are fused with one of the classical
fusion rules. In [4], Beynon proposed to use Dempster’s rule. Here we propose to use the PCR5 fusion rule since it is known
to deal more efficiently with possibly highly conflicting sources of evidence [25], Vol. 2. With PCR5, one
gets:





Elem. of 2^Θ | m_C1(.) | m_C2(.) | m_PCR5(.)
∅            | 0       | 0       | 0
A            | 0.0889  | 0.5002  | 0.3837
B            | 0       | 0       | 0.1162
A∪B          | 0       | 0.1208  | 0
C            | 0       | 0       | 0.0652
A∪C          | 0       | 0.1222  | 0.0461
B∪C          | 0.5337  | 0.2568  | 0.3887
A∪B∪C        | 0.3774  | 0       | 0
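
The PCR5 fusion of this step can be sketched as follows for two sources. The sketch is an illustration under our own assumptions: bba’s are dictionaries from frozensets to masses, the function name `pcr5` is ours, and the inputs are the bba’s as listed in Step 1 (in particular m_C2(B) = 0.1208).

```python
from itertools import product


def pcr5(m1, m2):
    """PCR5 fusion of two bba's with no mass on the empty set.

    Products of compatible focal elements go to their intersection;
    each conflicting product is given back to the two elements involved,
    proportionally to their masses.
    """
    fused = {}
    for (x1, v1), (x2, v2) in product(m1.items(), m2.items()):
        if v1 == 0.0 or v2 == 0.0:
            continue                           # nothing to transfer
        x = x1 & x2
        if x:
            fused[x] = fused.get(x, 0.0) + v1 * v2
        else:
            fused[x1] = fused.get(x1, 0.0) + v1 * v1 * v2 / (v1 + v2)
            fused[x2] = fused.get(x2, 0.0) + v2 * v2 * v1 / (v1 + v2)
    return fused


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m_c1 = {A: 0.0889, B | C: 0.5337, A | B | C: 0.3774}
m_c2 = {A: 0.5002, B: 0.1208, A | C: 0.1222, B | C: 0.2568}
m_pcr5 = pcr5(m_c1, m_c2)
```

With these inputs the sketch gives approximately 0.3837 for A, 0.1162 for B, 0.0652 for C, 0.0461 for A∪C and 0.3887 for B∪C, i.e. the last column of the table.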











• Step 3 (Decision-making): A final decision based on m_PCR5(.) must be taken. Usually, the decision-maker (DM) is
concerned with a single choice among the elements of Θ. Many decision-making approaches are possible depending on the
risk the DM is ready to take. A pessimistic DM will choose the singleton of Θ giving the maximum of credibility, whereas an
optimistic DM will choose the element having the maximum of plausibility. A fair attitude usually consists in choosing the
maximum of the approximate subjective probability of the elements of Θ. The result however depends strongly on the probabilistic
transformation (Pignistic, DSmP, Sudano’s, etc.) [25], Vol. 2. We recall² that the credibility Bel(.), the pignistic probability
BetP(.) and the plausibility Pl(.) functions satisfy for any A ∈ 2^Θ the following inequality:


$$Bel(A) \leq BetP(A) \leq Pl(A) \tag{5}$$

where Bel(A), Pl(A) and BetP(A) are defined from any bba m(.) by the following formulas:

$$Bel(A) = \sum_{\substack{B \subseteq A \\ B \in 2^\Theta}} m(B) \quad \text{and} \quad Pl(A) = \sum_{\substack{B \cap A \neq \emptyset \\ B \in 2^\Theta}} m(B) \tag{6}$$

$$BetP(A) = \sum_{Y \in 2^\Theta} \frac{|Y \cap A|}{|Y|}\, m(Y) \tag{7}$$

and where |Y| denotes the cardinality of Y. By convention one takes |∅|/|∅| = 1.
Below are the values of the credibility, the pignistic probability and the plausibility of A, B and C:


Elem. of Θ | Bel(.) | BetP(.) | Pl(.)
A          | 0.3837 | 0.4068  | 0.4298
B          | 0.1162 | 0.3105  | 0.5049
C          | 0.0652 | 0.2826  | 0.5000











Car A will be preferred with the pessimistic (based on the max of Bel(.)) or pignistic attitudes, whereas car B will be
preferred if an optimistic attitude is adopted, since one has Pl(B) > Pl(C) > Pl(A).
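
The three decision attitudes can be sketched directly from the definitions of Bel(.), BetP(.) and Pl(.). The code below is an illustration under our own assumptions (dictionary representation of bba’s, helper names `bel`, `betp`, `pl`); its input is the rounded bba m_PCR5(.) obtained in Step 2.

```python
def bel(m, a):
    """Credibility: total mass of the non-empty subsets of a."""
    return sum(mass for x, mass in m.items() if x and x <= a)


def pl(m, a):
    """Plausibility: total mass of the sets intersecting a."""
    return sum(mass for x, mass in m.items() if x & a)


def betp(m, a):
    """Pignistic probability: each mass is shared equally among
    the elements of its focal set."""
    return sum(mass * len(x & a) / len(x) for x, mass in m.items() if x)


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m_pcr5 = {A: 0.3837, B: 0.1162, C: 0.0652, A | C: 0.0461, B | C: 0.3887}

table = {name: (bel(m_pcr5, s), betp(m_pcr5, s), pl(m_pcr5, s))
         for name, s in (("A", A), ("B", B), ("C", C))}
pessimistic = max(table, key=lambda k: table[k][0])   # max of Bel
pignistic = max(table, key=lambda k: table[k][1])     # max of BetP
optimistic = max(table, key=lambda k: table[k][2])    # max of Pl
```

The pessimistic and pignistic choices are both A, whereas the optimistic choice is B, in agreement with the table above.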


² In this example, we work with Shafer’s model for the frame Θ, so that D^Θ = 2^Θ.

Now, if we assume that C2 (the reliability) is three times more important than C1 (fuel economy), the preference matrix
between these two criteria is given by:

$$M = \begin{bmatrix} 1/1 & 1/3 \\ 3/1 & 1/1 \end{bmatrix} \approx \begin{bmatrix} 1.0000 & 0.3333 \\ 3.0000 & 1.0000 \end{bmatrix}$$

Its normalized principal eigenvector is w = [0.2500 0.7500]', which indicates that C2 is indeed three times more important
than C1, as expressed in the prior DM preferences for ranking criteria. w = [w_1 w_2]' can also be obtained directly by solving
the algebraic system of equations w_2 = 3w_1 and w_1 + w_2 = 1 with w_1, w_2 ∈ [0,1]. If we apply the importance discounting
with β_1 = w_1 = 0.25 and β_2 = w_2 = 0.75, one gets the following discounted bba’s:





Elem. of 2^Θ | m_{β1,C1}(.) | m_{β2,C2}(.)
∅            | 0.7500       | 0.2500
A            | 0.0222       | 0.3751
B            | 0            | 0
A∪B          | 0            | 0.0906
C            | 0            | 0
A∪C          | 0            | 0.0917
B∪C          | 0.1334       | 0.1926
A∪B∪C        | 0.0944       | 0








With the PCR5∅ fusion of the sources m_{β1,C1}(.) and m_{β2,C2}(.), one gets the results in the table below. For decision-making support,
one prefers to work with normal bba’s. Therefore m_PCR5∅(.) is normalized by redistributing back m_PCR5∅(∅) proportionally
to the masses of other focal elements, as shown in the right column of the table.














Elem. of 2^Θ | m_PCR5∅(.) | normalized m_PCR5∅(.)
∅            | 0.6558     | 0
A            | 0.1794     | 0.5213
B            | 0.0121     | 0.0351
A∪B          | 0.0159     | 0.0461
C            | 0.0122     | 0.0355
A∪C          | 0.0161     | 0.0469
B∪C          | 0.1020     | 0.2963
A∪B∪C        | 0.0065     | 0.0188
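
The PCR5∅ fusion and the normalization can be sketched as follows. This is an illustration under stated assumptions and not the authors’ implementation: the empty set is treated as a focal element that conflicts with every non-empty set, the product m_1(∅)·m_2(∅) is kept on the empty set (our reading of the rule, which reproduces the tabulated values), and the inputs are the discounted bba’s exactly as tabulated above.

```python
from itertools import product

EMPTY = frozenset()


def pcr5_empty(m1, m2):
    """PCR5-type fusion of two importance-discounted bba's, which may
    carry mass on the empty set."""
    fused = {}
    for (x1, v1), (x2, v2) in product(m1.items(), m2.items()):
        if v1 == 0.0 or v2 == 0.0:
            continue
        x = x1 & x2
        if x or (not x1 and not x2):
            # compatible pair, or both masses already on the empty set
            fused[x] = fused.get(x, 0.0) + v1 * v2
        else:
            # conflicting pair: proportional redistribution to x1 and x2
            fused[x1] = fused.get(x1, 0.0) + v1 * v1 * v2 / (v1 + v2)
            fused[x2] = fused.get(x2, 0.0) + v2 * v2 * v1 / (v1 + v2)
    return fused


def normalize(m):
    """Redistribute the mass of the empty set proportionally to the
    masses of the other focal elements."""
    k = 1.0 - m.get(EMPTY, 0.0)
    return {x: v / k for x, v in m.items() if x != EMPTY}


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m_b1 = {EMPTY: 0.75, A: 0.0222, B | C: 0.1334, A | B | C: 0.0944}
m_b2 = {EMPTY: 0.25, A: 0.3751, A | B: 0.0906, A | C: 0.0917, B | C: 0.1926}

m_fused = pcr5_empty(m_b1, m_b2)
m_norm = normalize(m_fused)
```

Under these assumptions the sketch gives a mass of about 0.656 on the empty set and about 0.179 on A before normalization, and about 0.521 on A and 0.296 on B∪C after normalization, in agreement with the two columns of the table up to rounding.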











If all sources have the same full importance (i.e. all β_i = 1), then m_PCR5∅(.) = m_PCR5(.), which is normal because in such
a case m_{β_i=1,Ci}(.) = m_Ci(.). From the normalized m_PCR5∅(.) one can easily compute the credibility Bel(.), the pignistic
probability BetP(.) or the plausibility Pl(.) for each element of Θ for decision-making. In this example one gets:





Elem. of Θ | Bel(.) | BetP(.) | Pl(.)
A          | 0.5213 | 0.5741  | 0.6331
B          | 0.0351 | 0.2126  | 0.3963
C          | 0.0355 | 0.2134  | 0.3974











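If the classical AHP fusion method (i.e. weighted arithmetic mean) is used directly with the bba’s m_C1(.) and m_C2(.), one gets
(rows ordered as ∅, A, B, A∪B, C, A∪C, B∪C, A∪B∪C):

$$m_{AHP}(.) = \begin{bmatrix} 0 & 0 \\ 0.0889 & 0.5002 \\ 0 & 0 \\ 0 & 0.1208 \\ 0 & 0 \\ 0 & 0.1222 \\ 0.5337 & 0.2568 \\ 0.3774 & 0 \end{bmatrix} \times \begin{bmatrix} 0.25 \\ 0.75 \end{bmatrix} = \begin{bmatrix} 0 \\ 0.3974 \\ 0 \\ 0.0906 \\ 0 \\ 0.0917 \\ 0.3260 \\ 0.0944 \end{bmatrix}$$

which would have provided the following result for decision-making: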





Elem. of Θ | Bel(.) | BetP(.) | Pl(.)
A          | 0.3974 | 0.5200  | 0.6741
B          | 0      | 0.2398  | 0.5110
C          | 0      | 0.2403  | 0.5121
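
The classical AHP fusion used for this comparison is a plain weighted average, which can be sketched in a few lines. The list layout and the ASCII element labels are our own choices; the two columns are the bba’s as they appear in the matrix product above.

```python
# Elements of the power set, in the row order of the matrix product.
elements = ["empty", "A", "B", "AuB", "C", "AuC", "BuC", "AuBuC"]
m_c1 = [0.0, 0.0889, 0.0, 0.0, 0.0, 0.0, 0.5337, 0.3774]
m_c2 = [0.0, 0.5002, 0.0, 0.1208, 0.0, 0.1222, 0.2568, 0.0]
w = [0.25, 0.75]                       # importance weights of C1 and C2

# Componentwise weighted arithmetic mean of the two columns.
m_ahp = dict(zip(elements,
                 (w[0] * a + w[1] * b for a, b in zip(m_c1, m_c2))))
```

The result is approximately 0.3974 for A, 0.0906 for A∪B, 0.0917 for A∪C, 0.3260 for B∪C and 0.0944 for A∪B∪C. No mass is ever moved between elements, which illustrates why this rule does not process the conflict between the two criteria.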











In this very simple example, one sees that the importance discounting technique coupled with the PCR5-based fusion rule (what
we call the DSmT-AHP approach) will suggest, as with classical AHP, to choose the alternative A, since car A has a bigger
credibility (as well as a bigger pignistic probability and plausibility) than cars B or C. It is however worth noting that the
values of Bel(.), BetP(.) and Pl(.) obtained by both methods are slightly different. This difference can have a strong
impact on the final result in practice, for example if the costs of vehicles also have to be included in the final decision. Note
also that the uncertainties U(X) = Pl(X) − Bel(X) of all alternatives X = A, B, C are substantially reduced when
using DSmT-AHP compared with classical AHP, as seen in the following table. The uncertainty reduction
is a nice expected property, especially important for decision-making support.


Elem. of Θ | U(.) with AHP | U(.) with DSmT-AHP
A          | 0.2767        | 0.1118
B          | 0.5110        | 0.3612
C          | 0.5121        | 0.3619


II. CONCLUSIONS 


In this paper, we have presented a new method for Multi-Criteria Decision-Making (MCDM) and Multi-Criteria Group
Decision-Making (MCGDM) based on the combination of the AHP method developed by Saaty and DSmT. The AHP method makes it
possible to build bba’s from DM preferences of solutions, which are established with respect to several criteria. DSmT makes it
possible to aggregate efficiently the (possibly highly conflicting) bba’s based on each criterion. This DSmT-AHP method also takes
into account the different importances of the criteria and/or of the different members of the decision-makers group. The
DSmT-AHP approach, and especially its new fusion rule, has been applied to the
prevention of natural hazards in mountains and snow avalanches; see [25], Vol. 3, Chap. 23, and [29].


Cautious OWA and Evidential Reasoning
for Decision Making under Uncertainty


Jean-Marc Tacnet 
Jean Dezert 


Abstract—To make a decision under certainty, multicriteria
decision methods aim to choose, rank or sort alternatives on
the basis of quantitative or qualitative criteria and preferences
expressed by the decision-makers. However, decisions are often
made under uncertainty: choosing alternatives can have different
consequences depending on the external context (or state of the
world). In this paper, a new methodology called Cautious Ordered
Weighted Averaging with Evidential Reasoning (COWA-ER) is
proposed for decision making under uncertainty to take into
account imperfect evaluations of the alternatives and unknown
beliefs about groups of the possible states of the world (scenarii).
COWA-ER cautiously mixes the principle of Yager’s Ordered
Weighted Averaging (OWA) approach with the efficient fusion
of belief functions proposed in Dezert-Smarandache Theory
(DSmT).

Keywords: fusion, Ordered Weighted Averaging (OWA),
DSmT, uncertainty, information imperfection,
multi-criteria decision making (MCDM)


I. INTRODUCTION 
A. Decisions under certainty, risk or uncertainty 


Decisions in real-life situations are often difficult
multi-criteria problems. In the classical Multi-Criteria
Decision Making (MCDM) framework, those decisions consist
mainly in choosing, ranking or sorting alternatives, solutions
or more generally potential actions [17] on the basis of
quantitative or qualitative criteria. Existing methods differ in
aggregation principles (total or partial), preference weighting,
and so on. In total aggregation multicriteria decision
methods such as the Analytic Hierarchy Process (AHP) [19], the
result for an alternative is a unique value called the synthesis
criterion. Possible alternatives (A_i) belonging to a given set
A = {A_1, A_2, ..., A_q} are evaluated according to preferences
(represented by weights w_j) expressed by the decision-makers
on the different criteria (C_j) (see figure 1).

Decisions are often taken on the basis of imperfect
information and knowledge (imprecise, uncertain, incomplete)
provided by several more or less reliable sources and depending
on the states of the world: decisions can be taken in a certain,
risky or uncertain environment. In an MCDM context, decision
under certainty means that the evaluations of the alternatives
are independent of the states of the world. In other cases,
alternatives may be assessed differently depending on the
scenarii that are considered.


Originally published as Tacnet J.-M., Dezert J., Cautious
OWA and Evidential Reasoning for Decision Making under
Uncertainty, in Proc. of Fusion 2011 Conf., Chicago, July,
2011, and reprinted with permission.

Figure 1. Principle of a multi-criteria decision method based on a total aggregation principle.


In the classical framework of decision theory under uncertainty,
Expected Utility Theory (EUT) states that a decision
maker chooses between risky or uncertain alternatives or
actions by comparing their expected utilities [14]. Let us
consider an example of decision under uncertainty (or risk)
related to natural hazards management. On the lower parts of
a torrent catchment basin or an avalanche path, risk analysis
consists in evaluating the potential damage caused by the
effects of a hazard (a phenomenon with an intensity and a
frequency) on people and assets at risk. Different strategies
(A_i) are possible to protect the exposed areas. For each of
them, damage will depend on the different scenarii (S_j) of
the phenomenon, which can be more or less uncertain. An action
A_i (e.g. building a protection device, a dam) is evaluated
through its potential effects r_k, to which are associated utilities
u(r_k) (protection level of people, cost of protection, ...) and
probabilities p(r_k) (linked to natural events or states of nature
S_k). The expected utility U(A_i) of an action A_i is estimated
through the sum of products of utilities and probabilities of
all potential consequences of the action:

$$U(A_i) = \sum_k u(r_k) \cdot p(r_k) \tag{1}$$

When probabilities are known, the decision is made under risk.
When those probabilities become subjective, prospect
theory (subjective expected utility theory, SEUT) [12] can
apply:

• the objective utility (e.g. cost) u(r_k) is replaced by a
subjective function (value) denoted v(u(r_k));

• the objective weighting p(r_k) is replaced by a subjective
function π(p(r_k)).

v(·) is the felt subjective value in response to the expected
cost of the considered action, and π(·) is the felt weighting
in the face of the objective probability of the realisation of the result.
Prospect theory shows that the function v(·) is asymmetric:
a loss causes a negative reaction of stronger intensity than the
positive reaction caused by the equivalent gain. This corresponds
to an aversion to risky choices in the domain of gains and a
search for risky choices in the domain of losses.
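
Returning to the objective case of eq. (1), the comparison of actions by expected utility can be sketched as follows. All names and numbers are hypothetical and chosen only for illustration (two protection strategies, three consequences with known probabilities); they do not come from a real case study.

```python
def expected_utility(utilities, probabilities):
    """Sum of u(r_k) * p(r_k) over the potential consequences r_k."""
    if len(utilities) != len(probabilities):
        raise ValueError("one probability is needed per consequence")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    return sum(u * p for u, p in zip(utilities, probabilities))


# Hypothetical probabilities of three consequences (e.g. weak, medium
# and strong event) and hypothetical utilities of two actions.
p = [0.6, 0.3, 0.1]
utilities = {
    "A1 (protection dam)": [0.9, 0.7, 0.4],
    "A2 (no protection)": [1.0, 0.3, 0.0],
}
scores = {action: expected_utility(u, p) for action, u in utilities.items()}
best_action = max(scores, key=scores.get)
```

Here the expected utilities are 0.79 for the first action and 0.69 for the second, so the first one is preferred. In the subjective variant described above, u(r_k) and p(r_k) would first be passed through the functions v(·) and π(·).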

In an MCDM context, information imperfection concerns
both the evaluation of the alternatives (in any context of
certainty, risk or ignorance) and the uncertainty or lack of
knowledge about the possible states of the world. Uncertainty
and imprecision in multi-criteria decision models have been
considered early on [16]. Different kinds of uncertainty can be
considered: on the one hand, the internal uncertainty is linked
to the structure of the model and the judgmental inputs
required by the model; on the other hand, the external uncertainty
refers to the lack of knowledge about the consequences of a
particular choice.


B. Objectives and goals 


Several decision support methods exist to consider
information imperfection, source heterogeneity, reliability,
conflict and the different states of the world when evaluating
the alternatives, as summarized in figure 2. A more complete
review can be found in [28]. Here we just recall some
recent examples of methods mixing MCDM approaches and
Evidential Reasoning¹ (ER).


[Figure 2 (diagram). Recoverable labels: decision under certainty; decision under risk; decision under uncertainty or ignorance; imperfection in preferences (importance) evaluation; imperfection of alternatives evaluation (e.g. groups of alternatives, incompleteness of alternatives, imprecise evaluations...); different decision models and aggregation principles are used; classical methods (e.g. MAUT, AHP, outranking methods, Expected Utility Theory (EUT)); methods mixing ER and MCDM (e.g. DS-AHP, DSmT-AHP, ER-MCDA, OWA (2008), COWA-ER).]


Figure 2. Information imperfection in the different decision support methods 


• Dempster-Shafer-based AHP (DS-AHP) has introduced
a merging of Evidential Reasoning (ER) with the Analytic
Hierarchy Process (AHP) [19] to consider the imprecision
and the uncertainty in the evaluation of several alternatives.
The idea is to consider criteria as sources [1], [3] and
derive weights as discounting factors in the fusion process
[5];

• Dezert-Smarandache-based AHP (DSmT-AHP) [8] takes into
account the partial uncertainty (disjunctions) between
possible alternatives and introduces new fusion rules,
based on the Proportional Conflict Redistribution (PCR)
principle, which make it possible to consider differences between
importance and reliability of sources [23];

• ER-MCDA [28], [29] is based on AHP, fuzzy sets theory,
possibility theory and belief functions theory. This
method considers imperfection of criteria evaluations
as well as importance and reliability of sources.


¹ Evidential Reasoning refers to the use of belief functions as theoretical
background, not to a specific theory of belief functions (BF) aimed at
combining or conditioning BF. Actually, Dempster-Shafer Theory (DST) [21],
Dezert-Smarandache Theory (DSmT) [22], and Smets’ TBM [25] are different
approaches of Evidential Reasoning.


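Introducing ignorance and uncertainty in an MCDM process
consists in considering that the consequences of actions (A_i)
depend on the state of nature, represented by a finite set
S = {S_1, S_2, ..., S_n}. For each state, the MCDM method
provides an evaluation C_ij. We assume that this evaluation
C_ij done by the decision maker corresponds to the choice
of A_i when S_j occurs with a given (possibly subjective)
probability. The evaluation matrix is defined as C = [C_ij],
where i = 1, ..., q and j = 1, ..., n:

$$C = \begin{array}{c|ccccc} & S_1 & \cdots & S_j & \cdots & S_n \\ \hline A_1 & C_{11} & \cdots & C_{1j} & \cdots & C_{1n} \\ \vdots & \vdots & & \vdots & & \vdots \\ A_i & C_{i1} & \cdots & C_{ij} & \cdots & C_{in} \\ \vdots & \vdots & & \vdots & & \vdots \\ A_q & C_{q1} & \cdots & C_{qj} & \cdots & C_{qn} \end{array} \tag{2}$$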

Existing methods using evidential reasoning and MCDM 
have, up to now, focused on the case of imperfect evaluation 
of alternatives in a context of decision under certainty. In 
this paper, we propose a new method for decision under 
uncertainty that mixes MCDM principles, decision under 
uncertainty principles and evidential reasoning. For this 
purpose, we propose a framework that considers uncertainty 
and imperfection for scenarii corresponding to the state of 
the world. 


This paper is organized as follows. In section II, we
briefly recall the basis of DSmT. Section III presents two
existing methods for MCDM under uncertainty using belief
functions theory: DSmT-AHP as an extension of Saaty’s
multi-criteria decision method AHP, and Yager’s Ordered Weighted
Averaging (OWA) approach for decision making with belief
structures. The contribution of this paper is given in section
IV, where we describe an alternative to the classical OWA,
called the cautious OWA method, where evaluations of alternatives
depend on more or less uncertain scenarii. The flexibility
and advantages of this COWA method are also discussed.
Conclusions and perspectives are given in section V.

II. BELIEF FUNCTIONS AND DSMT


Dempster-Shafer Theory (DST) [21] offers a powerful
mathematical formalism (the belief functions) to model our belief
and uncertainty on the possible solutions of a given problem.
One of the pillars of DST is the Dempster-Shafer rule (DS) of
combination of belief functions. The purpose of the
development of Dezert-Smarandache Theory (DSmT) [22] is to
overcome the limitations of DST by proposing new underlying
models for the frames of discernment in order to fit better
with the nature of real problems, and new combination and
conditioning rules for circumventing problems with the DS rule,
especially when the sources to combine are highly conflicting.
In DSmT, the elements θ_i, i = 1, 2, ..., n of a given frame Θ
are not necessarily exclusive, and there is no restriction on the θ_i
other than their exhaustivity. Some integrity constraints (if any) can
be included in the underlying model of the frame. Instead of
working in the power set 2^Θ, we classically work on the hyper-power
set D^Θ (Dedekind’s lattice); see [22], Vol. 1 for details and
examples. A (generalized) basic belief assignment (bba) given
by a source of evidence is a mapping m : D^Θ → [0,1] such
that

$$m(\emptyset) = 0 \quad \text{and} \quad \sum_{A \in D^\Theta} m(A) = 1 \tag{3}$$

The generalized credibility and plausibility functions are
defined in almost the same manner as within DST, i.e.

$$Bel(A) = \sum_{\substack{B \subseteq A \\ B \in D^\Theta}} m(B) \quad \text{and} \quad Pl(A) = \sum_{\substack{B \cap A \neq \emptyset \\ B \in D^\Theta}} m(B) \tag{4}$$


In this paper, we will work with Shafer’s model of the frame
Θ, i.e. all elements θ_i of Θ are assumed truly exhaustive and
exclusive (disjoint). Therefore D^Θ = 2^Θ and the generalized
belief functions reduce to the classical ones. DSmT proposes
new efficient combination rules based on the proportional
conflict redistribution (PCR) principle for combining highly
conflicting sources of evidence. Also, the classical pignistic
transformation BetP(.) [26] is replaced by the more
effective DSmP(.) transformation to estimate the subjective
probabilities of hypotheses for classical decision-making. We
just recall briefly the PCR fusion rule no. 5 (PCR5) and the
Dezert-Smarandache Probabilistic (DSmP) transformation. All details
and justifications, with examples on PCR5 and DSmP, are freely
available on the web in [22], Vols. 2 & 3 and will not be
reported here.

• The Proportional Conflict Redistribution Rule no. 5:
PCR5 is generally used to combine bba’s in the DSmT framework.
PCR5 transfers the conflicting mass only to the elements
involved in the conflict and proportionally to their individual
masses, so that the specificity of the information is entirely
preserved in this fusion process. Let m_1(.) and m_2(.) be
two independent² bba’s; then the PCR5 rule is defined as
follows (see [22], Vol. 2 for full justification and examples):
m_PCR5(∅) = 0 and ∀X ∈ 2^Θ \ {∅},

$$m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) + \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \tag{5}$$

where all denominators in (5) are different from zero. If a
denominator is zero, that fraction is discarded. Additional
properties of PCR5 can be found in [9]. The extension of PCR5
for combining qualitative bba’s can be found in [22], Vols. 2 &
3. All propositions/sets are in canonical form. A variant of
PCR5, called PCR6, has been proposed by Martin and Osswald
in [22], Vol. 2, for combining s > 2 sources. The general
formulas for the PCR5 and PCR6 rules are also given in [22], Vol. 2.
PCR6 coincides with PCR5 for the fusion of two bba’s.


² i.e. each source provides its bba independently of the other sources.

• DSmP probabilistic transformation: DSmP is a serious
alternative to the classical pignistic transformation BetP since
it increases the probabilistic information content (PIC), i.e.
it reduces the Shannon entropy of the approximate subjective
probability measure drawn from any bba; see [22], Vol. 3,
Chap. 3 for details and the analytic expression of DSmP_ε(.).
When ε > 0 and when the masses of all singletons are
zero, DSmP_ε(.) = BetP(.), where the well-known pignistic
transformation BetP(.) is defined by Smets in [26].

In the Evidential Reasoning framework, decisions are
usually achieved by computing the expected utilities of the acts
using either the subjective/pignistic BetP(.) (usually adopted
in the DST framework) or DSmP(.) (as suggested in the DSmT
framework) as the probability function needed to compute
expectations. Usually, one uses the maximum of the pignistic
probability as decision criterion. The maximum of BetP(.) is
often considered as a balanced strategy between the two other
strategies for decision making: the max of plausibility
(optimistic strategy) or the max of credibility (pessimistic
strategy). The max of DSmP(.) is considered as more efficient
for practical applications since DSmP(.) is more informative
(it has a higher PIC value) than the BetP(.) transformation. The
justification of DSmP as a fair and useful transformation for
decision-making support can also be found in [10]. Note that
in the binary frame case, all the aforementioned decision
strategies yield the same final decision.


III. BELIEF FUNCTIONS AND MCDM


Two simple methods for MCDM under uncertainty are 
briefly presented: DSmT-AHP approach and Yager’s OWA 
approach. The new Cautious OWA approach that we propose 
will be developed in the next section. 


A. DSmT-AHP approach 


DSmT-AHP aims to serve the same purpose as AHP
[18], [19], SMART [30] or DS/AHP [1], [3], etc., that is, to find
the preference rankings of the decision alternatives (DA), or
groups of DA. The DSmT-AHP approach consists of three steps:


• Step 1: we extend the construction of the matrix to take
into account the partial uncertainty (disjunctions) between
possible alternatives. If no comparison is available
between elements, then the corresponding element in the
matrix is zero. Each bba related to each (sub-)criterion
is the normalized eigenvector associated with the largest
eigenvalue of the “uncertain” knowledge matrix (as done
in the standard AHP approach).

• Step 2: we use the DSmT fusion rules, typically the PCR5
rule, to combine the bba’s drawn from step 1 to get a final
MCDM priority ranking. This fusion step must take into
account the different importances (if any) of criteria, as
explained in the sequel.

• Step 3: decision-making can be based either on the
maximum of belief, or on the maximum of the plausibility
of DA, as well as on the maximum of the approximate
subjective probability of DA obtained by different
probabilistic transformations.


The MCDM problem deals with several criteria having
different importances, so the classical fusion rules cannot be
applied directly in step 2. In AHP, the fusion is done from
the product of the bba’s matrix with the weighting vector of
criteria. Such AHP fusion is nothing but a simple componentwise
weighted average of bba’s: it does not process
efficiently the conflicting information between the sources, and
it does not preserve the neutrality of a fully ignorant source in
the fusion. To palliate these problems, we have proposed a
new solution for combining sources of different importances
in [23]. Briefly, the reliability of a source is usually taken into
account with Shafer’s discounting method [21] defined by:

$$\begin{cases} m_\alpha(X) = \alpha \cdot m(X), & \text{for } X \neq \Theta \\ m_\alpha(\Theta) = \alpha \cdot m(\Theta) + (1 - \alpha) \end{cases} \tag{6}$$


where α ∈ [0,1] is the reliability discounting factor: α = 1
when the source is fully reliable and α = 0 if the source is
totally unreliable. We characterize the importance of a source
by an importance factor β in [0,1]. The β factor is usually not
related to the reliability of the source and can be set
to any value in [0,1] by the designer for his/her own reasons.
By convention, β = 1 means the maximal importance of the
source and β = 0 means no importance granted to this source.
From this β factor, we define the importance discounting by

$$\begin{cases} m_\beta(X) = \beta \cdot m(X), & \text{for } X \neq \emptyset \\ m_\beta(\emptyset) = \beta \cdot m(\emptyset) + (1 - \beta) \end{cases} \tag{7}$$


Here, we allow non-normal bba’s, since $m_\beta(\emptyset) \geq 0$,
as suggested by Smets in [24]. This new discounting preserves
the specificity of the primary information since all focal elements
are discounted with the same importance factor. Based on
this importance discounting, one can adapt the PCR5 (or PCR6)
rule for $N \geq 2$ discounted bba’s $m_{\beta,i}(.)$, $i = 1, 2, \ldots, N$, to
get with the PCR5$_\emptyset$ fusion rule (see details in [23]) a resulting
bba, which is then normalized because, in the AHP context,
the importance factors correspond to the components of the
normalized eigenvector $w$. It is important to note that such an
importance discounting method cannot be used in DST with
Dempster-Shafer’s rule of combination, because this rule
does not respond to the discounting of sources towards the
empty set (see Theorem 1 in [23] for a proof). The reliability
and the importance of sources can each be taken into account easily
and separately in the fusion process. Taking
them into account jointly is more difficult because, in general,
the reliability and importance discounting approaches do not
commute, except when $\alpha_i = \beta_i = 1$. To deal with both
reliability and importance factors despite the
non-commutativity of these discountings, two methods have also
been proposed in [23]; they are not reported here.
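The two discounting operations (6) and (7) can be sketched in a few lines of Python. This is only an illustrative sketch: the representation of a bba as a dictionary mapping `frozenset` focal elements to masses, and the function names `reliability_discount` and `importance_discount`, are assumptions made here and do not come from [21] or [23].

```python
from typing import Dict, FrozenSet

BBA = Dict[FrozenSet[str], float]   # focal element -> mass (assumed representation)
EMPTY: FrozenSet[str] = frozenset()


def reliability_discount(m: BBA, alpha: float, frame: FrozenSet[str]) -> BBA:
    """Eq. (6): scale every mass by alpha and move 1 - alpha to the whole frame."""
    out = {X: alpha * mass for X, mass in m.items()}
    out[frame] = out.get(frame, 0.0) + (1.0 - alpha)
    return out


def importance_discount(m: BBA, beta: float) -> BBA:
    """Eq. (7): scale every mass by beta and move 1 - beta to the empty set."""
    out = {X: beta * mass for X, mass in m.items()}
    out[EMPTY] = out.get(EMPTY, 0.0) + (1.0 - beta)
    return out


# Hypothetical two-element frame and bba, used only for illustration.
FRAME = frozenset({"A", "B"})
m = {frozenset({"A"}): 0.6, FRAME: 0.4}
m_rel = reliability_discount(m, alpha=0.5, frame=FRAME)
m_imp = importance_discount(m, beta=0.5)
```

In both cases the total mass stays equal to 1; the difference is only where the discounted mass goes (the full ignorance for reliability, the empty set for importance), which is why the result of importance discounting is a non-normal bba.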


B. Yager’s OWA approach 


Let us introduce Yager’s OWA approach [33] for decision
making with belief structures. One considers a collection of $q$
alternatives belonging to a set $A = \{A_1, A_2, \ldots, A_q\}$ and
a finite set $S = \{S_1, S_2, \ldots, S_n\}$ of states of nature.
We assume that the payoffs/gains $C_{ij}$ of the decision maker
in choosing $A_i$ when $S_j$ occurs are given by positive (or null)
numbers. The $q \times n$ payoff matrix is defined by $C = [C_{ij}]$,
where $i = 1, \ldots, q$ and $j = 1, \ldots, n$, as in eq. (2). The
decision-making problem consists in choosing the alternative
$A^* \in A$ which maximizes the payoff to the decision maker
given the knowledge on the state of nature and the payoff
matrix $C$. $A^* \in A$ is called the best alternative or the
solution (if any) of the decision-making problem. Depending on
the knowledge the decision-maker has on the states of
nature, he/she faces different decision-making problems:

1 - Decision-making under certainty: only one state of
nature is known and certain to occur, say $S_j$. Then the
decision-making solution consists in choosing $A^* = A_{i^*}$ with
$i^* = \arg\max_i \{C_{ij}\}$.

2 - Decision-making under risk: the true state of nature
is unknown but one knows all the probabilities $p_j = P(S_j)$,
$j = 1, \ldots, n$, of the possible states of nature. In this
case, we use the maximum of expected values for decision-making.
For each alternative $A_i$, we compute its expected
payoff $E[C_i] = \sum_j p_j \cdot C_{ij}$, then we choose $A^* = A_{i^*}$ with
$i^* \triangleq \arg\max_i \{E[C_i]\}$.

3 - Decision-making under ignorance: one assumes no
knowledge about the true state of nature except that it belongs
to $S$. In this case, Yager proposes to use the OWA operator,
assuming a given decision attitude taken by the decision-maker.
Given a set of values/payoffs $c_1, c_2, \ldots, c_n$, OWA consists
in choosing a normalized set of weighting factors
$W = [w_1, w_2, \ldots, w_n]$, where $w_j \in [0,1]$ and $\sum_j w_j = 1$, and
in computing $OWA(c_1, c_2, \ldots, c_n)$ as

$$
OWA(c_1, c_2, \ldots, c_n) = \sum_j w_j \cdot b_j \tag{8}
$$

where $b_j$ is the $j$th largest element in the collection $c_1, c_2, \ldots, c_n$.
As seen in (8), the OWA operator is nothing but a simple
weighted average of ordered values of a variable.

Based on such OWA operators, the idea consists, for each
alternative $A_i$, $i = 1, \ldots, q$, in choosing a weighting vector
$W_i = [w_{i1}, w_{i2}, \ldots, w_{in}]$ and computing its OWA value
$V_i \triangleq OWA(C_{i1}, C_{i2}, \ldots, C_{in}) = \sum_j w_{ij} \cdot b_{ij}$, where $b_{ij}$ is the
$j$th largest element in the collection of payoffs $C_{i1}, C_{i2}, \ldots, C_{in}$.
Then, as for decision-making under risk, we choose
$A^* = A_{i^*}$ with $i^* \triangleq \arg\max_i \{V_i\}$. The determination of $W_i$
depends on the decision attitude taken by the decision-maker.
The pessimistic attitude considers, for all $i = 1, 2, \ldots, q$,
$W_i = [0, 0, \ldots, 0, 1]$. In this case, we assign to $A_i$ the least
payoff and we choose the best worst (the max of least payoffs).
It is a Max-Min strategy since $i^* = \arg\max_i (\min_j C_{ij})$.
The optimistic attitude considers, for all $i = 1, 2, \ldots, q$,
$W_i = [1, 0, \ldots, 0, 0]$. We commit to $A_i$ its best payoff and
we select the best best. It is a Max-Max strategy since
$i^* = \arg\max_i (\max_j C_{ij})$. Between these two extreme attitudes,
we can define an infinity of intermediate attitudes, like
the normative/neutral attitude (when, for all $i = 1, 2, \ldots, q$,
$W_i = [1/n, 1/n, \ldots, 1/n, 1/n]$), which corresponds to the
simple arithmetic mean, or the Hurwicz attitude (i.e. a weighted
average of pessimistic and optimistic attitudes), etc. To justify
the choice of the OWA method, Yager defines an optimistic index
$\alpha \in [0,1]$ from the components of $W_i$ and proposes to
compute (by mathematical programming) the best weighting
vector $W_i$ corresponding to an a priori chosen optimistic index
and having the maximal entropy (dispersion). If $\alpha = 1$
(optimistic attitude) then of course $W_i = [1, 0, \ldots, 0, 0]$, and
if $\alpha = 0$ (pessimistic attitude) then $W_i = [0, 0, \ldots, 0, 1]$. In
theory, Yager’s method does not exclude the possibility of adopting
a hybrid attitude depending on the alternative we consider. In
other words, we are not forced to consider the same weighting
vectors for all alternatives.

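Example 1: Let’s take states $S = \{S_1, S_2, S_3, S_4\}$, alternatives
$A = \{A_1, A_2, A_3\}$ and the payoff matrix:

$$
C = \begin{array}{c|cccc}
    & S_1 & S_2 & S_3 & S_4 \\ \hline
A_1 & 10  & 0   & 20  & 30  \\
A_2 & 1   & 10  & 20  & 30  \\
A_3 & 30  & 10  & 2   & 5
\end{array}
\tag{9}
$$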


If one adopts the pessimistic attitude by choosing $W_1 =
W_2 = W_3 = [0, 0, 0, 1]$, then one gets for each alternative
$A_i$, $i = 1, 2, 3$, the following OWA values: $V_1 =
OWA(10, 0, 20, 30) = 0$, $V_2 = OWA(1, 10, 20, 30) = 1$ and
$V_3 = OWA(30, 10, 2, 5) = 2$. The final decision will be the
alternative $A_3$ since it offers the best expected payoff.

If one adopts the optimistic attitude by choosing $W_1 =
W_2 = W_3 = [1, 0, 0, 0]$, then one gets for each alternative
$A_i$, $i = 1, 2, 3$, the following OWA values: $V_1 =
OWA(10, 0, 20, 30) = 30$, $V_2 = OWA(1, 10, 20, 30) = 30$ and
$V_3 = OWA(30, 10, 2, 5) = 30$. All alternatives offer the same
expected payoff and thus the final decision must be chosen
randomly or purely ad hoc, since there is no best alternative.

If one adopts the normative attitude by choosing $W_1 =
W_2 = W_3 = [1/4, 1/4, 1/4, 1/4]$ (i.e. one assumes that
all states of nature are equiprobable), then one gets: $V_1 =
OWA(10, 0, 20, 30) = 60/4$, $V_2 = OWA(1, 10, 20, 30) =
61/4$ and $V_3 = OWA(30, 10, 2, 5) = 47/4$. The final decision
will be the alternative $A_2$ since it offers the best expected
payoff.
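The OWA operator of eq. (8) and the three attitudes of Example 1 can be checked with a short Python sketch. The function names `owa`, `attitude_weights` and `owa_values` are illustrative choices made here, not part of Yager’s formulation.

```python
from typing import List, Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Eq. (8): weighted average of the values sorted in decreasing order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)      # b_1 >= b_2 >= ... >= b_n
    return sum(w * b for w, b in zip(weights, ordered))


def attitude_weights(n: int, attitude: str) -> List[float]:
    """Weighting vectors of the three attitudes used in Example 1."""
    if attitude == "pessimistic":
        return [0.0] * (n - 1) + [1.0]          # all weight on the smallest payoff
    if attitude == "optimistic":
        return [1.0] + [0.0] * (n - 1)          # all weight on the largest payoff
    if attitude == "normative":
        return [1.0 / n] * n                    # plain arithmetic mean
    raise ValueError(f"unknown attitude: {attitude}")


def owa_values(payoffs: Sequence[Sequence[float]], attitude: str) -> List[float]:
    """OWA value V_i of every alternative (one row of the payoff matrix each)."""
    return [owa(row, attitude_weights(len(row), attitude)) for row in payoffs]


# Payoff matrix (9): rows are A_1, A_2, A_3, columns are S_1, ..., S_4.
PAYOFFS = [[10, 0, 20, 30], [1, 10, 20, 30], [30, 10, 2, 5]]
pessimistic = owa_values(PAYOFFS, "pessimistic")
optimistic = owa_values(PAYOFFS, "optimistic")
normative = owa_values(PAYOFFS, "normative")
```

Taking the index of the largest value in each list gives the decision for the corresponding attitude; in the optimistic case all three values tie, which is the ambiguity noted in the example.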


4 - Decision-making under uncertainty: this corresponds
to the general case where the knowledge on the states of
nature is characterized by a belief structure. Clearly, one
assumes that the a priori knowledge on the frame $S$ of the different
states of nature is given by a bba $m(.) : 2^S \to [0,1]$. This
case includes all previous cases depending on the choice of
$m(.)$. Decision under certainty is characterized by $m(S_j) = 1$;
decision under risk is characterized by $m(s) > 0$ for some
states $s \in S$; decision under full ignorance is characterized
by $m(S_1 \cup S_2 \cup \ldots \cup S_n) = 1$, etc. Yager’s OWA for decision-making
under uncertainty combines the schemes used for
decision making under risk and ignorance. It is based on the
derivation of a generalized expected value $C_i$ of payoff for
each alternative $A_i$ as follows:

$$
C_i = \sum_{k=1}^{r} m(X_k) V_{ik} \tag{10}
$$

where $r$ is the number of focal elements of the belief structure
$(S, m(.))$, $m(X_k)$ is the mass of belief of the focal element
$X_k \in 2^S$, and $V_{ik}$ is the payoff we get when we select
$A_i$ and the state of nature lies in $X_k$. The derivation
of $V_{ik}$ is done similarly as for decision making under
ignorance, when restricting the states of nature to the subset
of states belonging to $X_k$ only. Therefore, for $A_i$ and a focal
element $X_k$, instead of using all payoffs $C_{ij}$, we consider
only the payoffs in the set $M_{ik} = \{C_{ij} \mid S_j \in X_k\}$ and
$V_{ik} = OWA(M_{ik})$ for some decision-making attitude chosen
a priori. Once the generalized expected values $C_i$, $i = 1, 2, \ldots, q$,
are computed, we select the alternative with the highest
$C_i$ as the best alternative (i.e. the final decision). The principle
of this method is very simple, but its implementation can be
quite greedy in computational resources, especially if one wants
to adopt a particular attitude for a given level of optimism and
the dimension of the frame $S$ is large: one needs to
compute by mathematical programming the weighting vectors
generating the optimism level and having the maximum entropy.
As an illustrative example, we take Yager’s example³ [33] with
pessimistic, optimistic and normative attitudes.

Example 2: Let’s take states $S = \{S_1, S_2, S_3, S_4, S_5\}$ with
associated bba $m(S_1 \cup S_3 \cup S_4) = 0.6$, $m(S_2 \cup S_5) = 0.3$
and $m(S_1 \cup S_2 \cup S_3 \cup S_4 \cup S_5) = 0.1$. Let’s also consider
alternatives $A = \{A_1, A_2, A_3, A_4\}$ and the payoff matrix:

$$
C = \begin{bmatrix}
7  & 5  & 12 & 13 & 6 \\
12 & 10 & 5  & 11 & 2 \\
9  & 13 & 3  & 10 & 9 \\
6  & 9  & 11 & 15 & 4
\end{bmatrix}
\tag{11}
$$


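The $r = 3$ focal elements of $m(.)$ are $X_1 = S_1 \cup S_3 \cup S_4$,
$X_2 = S_2 \cup S_5$ and $X_3 = S_1 \cup S_2 \cup S_3 \cup S_4 \cup S_5$. $X_1$ and
$X_2$ are partial ignorances and $X_3$ is the full ignorance. One
considers the following submatrices (called bags by Yager) for
the derivation of $V_{ik}$, for $i = 1, 2, 3, 4$ and $k = 1, 2, 3$:

$$
M(X_1) = \begin{bmatrix} M_{11} \\ M_{21} \\ M_{31} \\ M_{41} \end{bmatrix}
       = \begin{bmatrix} 7 & 12 & 13 \\ 12 & 5 & 11 \\ 9 & 3 & 10 \\ 6 & 11 & 15 \end{bmatrix}
$$

$$
M(X_2) = \begin{bmatrix} M_{12} \\ M_{22} \\ M_{32} \\ M_{42} \end{bmatrix}
       = \begin{bmatrix} 5 & 6 \\ 10 & 2 \\ 13 & 9 \\ 9 & 4 \end{bmatrix}
$$

$$
M(X_3) = \begin{bmatrix} M_{13} \\ M_{23} \\ M_{33} \\ M_{43} \end{bmatrix}
       = \begin{bmatrix} 7 & 5 & 12 & 13 & 6 \\ 12 & 10 & 5 & 11 & 2 \\ 9 & 13 & 3 & 10 & 9 \\ 6 & 9 & 11 & 15 & 4 \end{bmatrix}
       = C
$$

³ There is a typo in Yager’s original example [33].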


• Using the pessimistic attitude, and applying the OWA operator
on each row of $M(X_k)$ for $k = 1$ to $r$, one
finally gets⁴: $V(X_1) = [V_{11}, V_{21}, V_{31}, V_{41}]^t = [7, 5, 3, 6]^t$,
$V(X_2) = [V_{12}, V_{22}, V_{32}, V_{42}]^t = [5, 2, 9, 4]^t$ and $V(X_3) =
[V_{13}, V_{23}, V_{33}, V_{43}]^t = [5, 2, 3, 4]^t$. Applying formula (10)
for $i = 1, 2, 3, 4$, one finally gets the following generalized
expected values using vectorial notation:

$$
[C_1, C_2, C_3, C_4]^t = \sum_{k=1}^{r=3} m(X_k) \cdot V(X_k) = [6.2, 3.8, 4.8, 5.2]^t
$$

According to these values, the best alternative to take is $A_1$,
since it has the highest generalized expected payoff.

• Using the optimistic attitude, one takes the max value of each
row. Applying OWA on each row of $M(X_k)$ for $k = 1$ to
$r$, one gets: $V(X_1) = [V_{11}, V_{21}, V_{31}, V_{41}]^t = [13, 12, 10, 15]^t$,
$V(X_2) = [V_{12}, V_{22}, V_{32}, V_{42}]^t = [6, 10, 13, 9]^t$, and $V(X_3) =
[V_{13}, V_{23}, V_{33}, V_{43}]^t = [13, 12, 13, 15]^t$. One finally gets
$[C_1, C_2, C_3, C_4]^t = [10.9, 11.4, 11.2, 13.2]^t$ and the best
alternative to take with the optimistic attitude is $A_4$, since it has the
highest generalized expected payoff.

• Using the normative attitude, one takes $W_1 = W_2 =
W_3 = W_4 = [1/|X_k|, 1/|X_k|, \ldots, 1/|X_k|]$, where $|X_k|$ is the
cardinality of the focal element $X_k$ under consideration. The
number of elements in $W_i$ is equal to $|X_k|$. The generalized
expected values are $[C_1, C_2, C_3, C_4]^t = [9.1, 8.3, 8.4, 9.4]^t$
and the best alternative with the normative attitude is $A_4$ (same
as with the optimistic attitude), since it has the highest generalized
expected payoff.
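The generalized expected value of eq. (10) for the two extreme attitudes can be sketched as follows. The sketch uses the fact that OWA with $W = [0, \ldots, 0, 1]$ is the minimum of a bag and OWA with $W = [1, 0, \ldots, 0]$ is its maximum; the 0-based encoding of states and the name `generalized_expected_values` are assumptions made here for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Payoff matrix (11): rows A_1..A_4, columns S_1..S_5.
C = [
    [7, 5, 12, 13, 6],
    [12, 10, 5, 11, 2],
    [9, 13, 3, 10, 9],
    [6, 9, 11, 15, 4],
]

# Belief structure of Example 2; states are encoded by 0-based column indexes.
BBA: List[Tuple[Tuple[int, ...], float]] = [
    ((0, 2, 3), 0.6),        # X_1 = S_1 U S_3 U S_4
    ((1, 4), 0.3),           # X_2 = S_2 U S_5
    ((0, 1, 2, 3, 4), 0.1),  # X_3 = full ignorance
]


def generalized_expected_values(
    payoffs: Sequence[Sequence[float]],
    bba: Sequence[Tuple[Tuple[int, ...], float]],
    attitude: Callable[[Sequence[float]], float],
) -> List[float]:
    """Eq. (10): C_i = sum_k m(X_k) * V_ik, with V_ik = attitude(bag M_ik)."""
    result = []
    for row in payoffs:
        total = 0.0
        for focal, mass in bba:
            bag = [row[j] for j in focal]   # M_ik: payoffs restricted to X_k
            total += mass * attitude(bag)
        result.append(total)
    return result


c_min = generalized_expected_values(C, BBA, min)  # pessimistic attitude
c_max = generalized_expected_values(C, BBA, max)  # optimistic attitude
```

Any other attitude can be plugged in by passing a callable that applies an OWA weighting vector of length $|X_k|$ to the bag.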


C. Using expected utility theory 


In this section, we propose a much simpler approach
than Yager’s OWA approach for decision making under
uncertainty. The idea is to approximate the bba $m(.)$ by a
subjective probability measure through a given probabilistic
transformation. We suggest using either the BetP or (better) the
DSmP transformation for doing this, as explained in [22]
(Vol. 3, Chap. 3). Let’s return to the previous example and
compute the $BetP(.)$ and $DSmP_\epsilon(.)$ values from $m(.)$.


⁴ where $X^t$ denotes the transpose of $X$.


One gets the same values in this particular example for any
$\epsilon > 0$ because we do not have singletons as focal elements of
$m(.)$, which is normal. Here $BetP(S_1) = DSmP(S_1) =
0.22$, $BetP(S_2) = DSmP(S_2) = 0.17$, $BetP(S_3) =
DSmP(S_3) = 0.22$, $BetP(S_4) = DSmP(S_4) = 0.22$
and $BetP(S_5) = DSmP(S_5) = 0.17$. Based on these
probabilities, we can compute the expected payoffs for each
alternative as for decision making under risk (e.g. for $C_1$, we
get $7 \cdot 0.22 + 5 \cdot 0.17 + 12 \cdot 0.22 + 13 \cdot 0.22 + 6 \cdot 0.17 = 8.91$).
For the 4 alternatives, we finally get:

$$
E_{BetP}[C] = E_{DSmP}[C] = [8.91, 8.20, 8.58, 9.25]^t
$$

According to these values, one sees that the best alternative
with this pignistic or DSm attitude is $A_4$ (same as with
Yager’s optimistic or normative attitudes) since it offers the
highest pignistic or DSm expected payoff. This much simpler
approach must be used with care, however, because there is a
loss of information through the approximation of the bba $m(.)$
by any subjective probability measure. Therefore, we do not
recommend using it in general.
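The pignistic probabilities and the resulting expected payoffs can be reproduced with the sketch below. It assumes a normalized bba (no mass on the empty set) and only covers BetP, which splits each mass equally among the states of its focal element; the function names are illustrative.

```python
from typing import List, Sequence, Tuple

# Payoff matrix (11) and belief structure of Example 2 (0-based state indexes).
C = [
    [7, 5, 12, 13, 6],
    [12, 10, 5, 11, 2],
    [9, 13, 3, 10, 9],
    [6, 9, 11, 15, 4],
]
BBA: List[Tuple[Tuple[int, ...], float]] = [
    ((0, 2, 3), 0.6),
    ((1, 4), 0.3),
    ((0, 1, 2, 3, 4), 0.1),
]


def betp(bba: Sequence[Tuple[Tuple[int, ...], float]], n_states: int) -> List[float]:
    """Pignistic transformation: each mass is shared equally inside its focal element."""
    prob = [0.0] * n_states
    for focal, mass in bba:
        share = mass / len(focal)
        for j in focal:
            prob[j] += share
    return prob


def expected_payoffs(payoffs: Sequence[Sequence[float]], prob: Sequence[float]) -> List[float]:
    """Decision under risk: E[C_i] = sum_j p_j * C_ij."""
    return [sum(p * c for p, c in zip(prob, row)) for row in payoffs]


p = betp(BBA, n_states=5)
e = expected_payoffs(C, p)
best = e.index(max(e))   # 0-based index of the selected alternative
```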


IV. THE NEW COWA-ER APPROACH 


Yager’s OWA approach is based on the choice of a given
attitude, measured by an optimistic index in $[0,1]$, to get the
weighting vector $W$. How should such an index/attitude be chosen?
This choice is ad hoc and very disputable for users. What should
we do if we do not know which attitude to adopt? The rational
answer to this question is to consider the results of the two
extreme attitudes (pessimistic and optimistic) jointly and
to develop a new method for decision under uncertainty
based on the imprecise valuation of alternatives. This is the
approach developed in this paper. We call it Cautious OWA
with Evidential Reasoning (COWA-ER) because it adopts the
cautious attitude (based on the possible extreme attitudes) and
ER, as explained in the sequel.

Let’s return to the previous example and take the pessimistic
and optimistic valuations of the expected payoffs.
The expected payoffs $E[C_i]$ are imprecise since they belong
to the interval $[C_i^{\min}, C_i^{\max}]$, whose bounds are computed with
the extreme pessimistic and optimistic attitudes, and one has

$$
E[C] = \begin{bmatrix} E[C_1] \\ E[C_2] \\ E[C_3] \\ E[C_4] \end{bmatrix}
\subset \begin{bmatrix} [6.2; 10.9] \\ [3.8; 11.4] \\ [4.8; 11.2] \\ [5.2; 13.2] \end{bmatrix}
$$





Therefore, one has 4 sources of information about the
parameter associated with the best alternative to choose.
For decision making under imprecision, we propose to use
here again the belief functions framework and to adopt the
very simple COWA-ER methodology based on the
following four steps:

• Step 1: normalization of the imprecise values in $[0, 1]$;

• Step 2: conversion of each normalized imprecise value
into an elementary bba $m_i(.)$;

• Step 3: fusion of the bba’s $m_i(.)$ with an efficient combination
rule (typically PCR5);

• Step 4: choice of the final decision based on the resulting
combined bba.

Let’s describe each step of COWA-ER in detail. In step 1,
we divide each bound of the intervals by the max of the bounds
to get a new normalized imprecise expected payoff vector
$E^{Imp}[C]$. In our example, one gets:

$$
E^{Imp}[C] = \begin{bmatrix} [6.2/13.2; 10.9/13.2] \\ [3.8/13.2; 11.4/13.2] \\ [4.8/13.2; 11.2/13.2] \\ [5.2/13.2; 13.2/13.2] \end{bmatrix}
\approx \begin{bmatrix} [0.47; 0.82] \\ [0.29; 0.86] \\ [0.36; 0.85] \\ [0.39; 1.00] \end{bmatrix}
$$
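Step 1 amounts to a single division by the largest upper bound. A minimal sketch, assuming that the intervals are given as `(lower, upper)` pairs and that the largest upper bound is strictly positive (the function name is illustrative):

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def normalize_intervals(intervals: Sequence[Interval]) -> List[Interval]:
    """Step 1 of COWA-ER: divide every bound by the largest upper bound."""
    top = max(upper for _, upper in intervals)
    if top <= 0:
        raise ValueError("the largest upper bound must be strictly positive")
    return [(lower / top, upper / top) for lower, upper in intervals]


# Pessimistic/optimistic bounds of the expected payoffs of A_1..A_4.
bounds = [(6.2, 10.9), (3.8, 11.4), (4.8, 11.2), (5.2, 13.2)]
normalized = normalize_intervals(bounds)
```

The values printed in the text are these ratios reported with two decimals.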


In step 2, we convert each imprecise value into its bba
according to a very natural and simple transformation [7].
Here, we need to consider as frame of discernment the finite
set of alternatives $\Theta = \{A_1, A_2, A_3, A_4\}$ and the sources
of belief associated with them, obtained from the normalized
imprecise expected payoff vector $E^{Imp}[C]$. The modeling for
computing a bba associated with the hypothesis $A_i$ from any
imprecise value $[a; b] \subseteq [0; 1]$ is very simple and is done as
follows:

$$
\begin{cases}
m_i(A_i) = a \\
m_i(\bar{A}_i) = 1 - b \\
m_i(A_i \cup \bar{A}_i) = m_i(\Theta) = b - a
\end{cases}
\tag{12}
$$

where $\bar{A}_i$ is the complement of $A_i$ in $\Theta$. With such a simple
conversion, one sees that $Bel(A_i) = a$ and $Pl(A_i) = b$. The
uncertainty is represented by the length of the interval $[a; b]$
and it corresponds to the imprecision of the variable (here the
expected payoff) on which the belief function for
$A_i$ is defined. In the example, one gets:





Alternatives A_i | m_i(A_i) | m_i(Ā_i) | m_i(A_i ∪ Ā_i)
A_1              | 0.47     | 0.18     | 0.35
A_2              | 0.29     | 0.14     | 0.57
A_3              | 0.36     | 0.15     | 0.49
A_4              | 0.39     | 0        | 0.61

Table I
BASIC BELIEF ASSIGNMENTS OF THE ALTERNATIVES
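The conversion (12) from a normalized interval to an elementary bba can be sketched as below. The dictionary-of-`frozenset` representation and the name `interval_to_bba` are assumptions made for this illustration.

```python
from typing import Dict, FrozenSet, Tuple

BBA = Dict[FrozenSet[str], float]

FRAME: FrozenSet[str] = frozenset({"A1", "A2", "A3", "A4"})


def interval_to_bba(alternative: str, interval: Tuple[float, float],
                    frame: FrozenSet[str] = FRAME) -> BBA:
    """Eq. (12): m(A_i) = a, m(complement of A_i) = 1 - b, m(Theta) = b - a."""
    a, b = interval
    if not 0.0 <= a <= b <= 1.0:
        raise ValueError("interval must satisfy 0 <= a <= b <= 1")
    singleton = frozenset({alternative})
    return {
        singleton: a,                 # belief committed to A_i
        frame - singleton: 1.0 - b,   # belief committed against A_i
        frame: b - a,                 # remaining uncertainty
    }


# First row of Table I, built from the normalized interval [0.47; 0.82].
m1 = interval_to_bba("A1", (0.47, 0.82))
```

By construction the three masses sum to one, and the credibility and plausibility of the singleton are the two interval bounds.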


In step 3, we need to combine the bba’s $m_i(.)$ by an efficient
rule of combination. Here, we suggest using the PCR5 rule
proposed in the DSmT framework, since it has proved very
efficient in dealing with possibly highly conflicting sources of
evidence. PCR5 has already been applied successfully in all
applications where it has been used so far [22]. We call
this COWA-ER method based on PCR5 COWA-PCR5.
Obviously, we could replace the PCR5 rule by any other rule (DS
rule, Dubois & Prade’s rule, Yager’s rule, etc.) and thus easily define
COWA-DS, COWA-DP, COWA-Y, etc. variants of COWA-ER.
This is not the purpose of this paper and it has no
fundamental interest in this presentation. The result of the
combination of the bba’s with PCR5 for our example is given
in Table II.
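To make the redistribution principle concrete, here is a sketch of PCR5 restricted to two sources: non-conflicting products go to the intersection, and each conflicting product $m_1(X) m_2(Y)$ with $X \cap Y = \emptyset$ is sent back to $X$ and $Y$ proportionally to $m_1(X)$ and $m_2(Y)$. It assumes Shafer’s model (exclusive hypotheses encoded as `frozenset`s) and inputs without mass on the empty set. It is not the general $N$-source formula used for Table II, and since PCR5 is not associative, applying it pairwise is not expected to reproduce that table.

```python
from itertools import product
from typing import Dict, FrozenSet

BBA = Dict[FrozenSet[str], float]


def pcr5_two_sources(m1: BBA, m2: BBA) -> BBA:
    """PCR5 for two bba's: conjunctive rule plus proportional conflict redistribution."""
    fused: BBA = {}
    for (x, a), (y, b) in product(m1.items(), m2.items()):
        common = x & y
        if common:
            # Agreement: the product goes to the intersection.
            fused[common] = fused.get(common, 0.0) + a * b
        elif a + b > 0:
            # Conflict: a*b is split between x and y in proportion a : b.
            fused[x] = fused.get(x, 0.0) + a * a * b / (a + b)
            fused[y] = fused.get(y, 0.0) + b * b * a / (a + b)
    return fused


# Small hypothetical example on a two-element frame.
A, B = frozenset({"A"}), frozenset({"B"})
THETA = A | B
toy = pcr5_two_sources({A: 0.6, THETA: 0.4}, {B: 0.3, THETA: 0.7})

# First two elementary bba's of Table I on the frame {A1, A2, A3, A4}.
FRAME = frozenset({"A1", "A2", "A3", "A4"})
m1 = {frozenset({"A1"}): 0.47, FRAME - {"A1"}: 0.18, FRAME: 0.35}
m2 = {frozenset({"A2"}): 0.29, FRAME - {"A2"}: 0.14, FRAME: 0.57}
m12 = pcr5_two_sources(m1, m2)
```

In the toy example the conflicting product $0.6 \times 0.3 = 0.18$ is split as $0.12$ for $A$ and $0.06$ for $B$, so the fused masses still sum to one without any normalization step.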

Focal element             | m_PCR5(.)
A_1                       | 0.2488
A_2                       | 0.1142
A_3                       | 0.1600
A_4                       | 0.1865
A_1 ∪ A_4                 | 0.0045
A_2 ∪ A_4                 | 0.0094
A_1 ∪ A_2 ∪ A_4           | 0.0236
A_3 ∪ A_4                 | 0.0075
A_1 ∪ A_3 ∪ A_4           | 0.0198
A_2 ∪ A_3 ∪ A_4           | 0.0374
A_1 ∪ A_2 ∪ A_3 ∪ A_4     | 0.1883

Table II
FUSION OF THE FOUR ELEMENTARY BBA’S WITH PCR5

The last step 4 is the decision-making from the resulting bba
of the fusion step 3. This problem is recurrent in the theory
of belief functions and several attitudes are also possible, as
explained at the end of Section II. Table III shows
the values of the credibilities, plausibilities, BetP and $DSmP_{\epsilon=0}$
for each alternative in our example.





A_i | Bel(A_i) | BetP(A_i) | DSmP(A_i) | Pl(A_i)
A_1 | 0.2488   | 0.3126    | 0.3364    | 0.4850
A_2 | 0.1142   | 0.1863    | 0.1623    | 0.3729
A_3 | 0.1600   | 0.2299    | 0.2242    | 0.4130
A_4 | 0.1865   | 0.2712    | 0.2771    | 0.4521

Table III
CREDIBILITY AND PLAUSIBILITY OF A_i


Based on the results of Table III, it is interesting to note
that, in this example, there is no ambiguity in the decision:
whatever attitude is taken by the decision-maker
(the max of Bel, the max of Pl, the max of BetP or the max of
DSmP), the decision to take will always be $A_1$. Such behavior
is probably not general in all problems, but at least it shows
that in some cases, as in Yager’s example, the ambiguity in
the decision can be removed by using COWA-PCR5 instead of
OWA, which is an advantage of our approach. It is worth
noting that the Shannon entropy of BetP, $H_{BetP} = 1.9742$ bits, is
bigger than the Shannon entropy of DSmP, $H_{DSmP} = 1.9512$
bits, which is normal since DSmP has been developed to
increase the PIC value.
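The decision measures of step 4 can be recomputed from the fused masses of Table II with the sketch below. It assumes the focal elements as labelled in Table II, uses integers 1 to 4 for the alternatives, and implements DSmP only in its $\epsilon = 0$ form (each mass is shared among the singletons of its focal element proportionally to their own masses, which requires those masses to be non-zero). Recomputed this way, Bel, BetP and DSmP agree with Table III to four decimals, as does Pl for $A_1$, $A_2$ and $A_3$; for $A_4$ the plausibility obtained from these masses is 0.4770 rather than the tabulated 0.4521, which does not change the decision.

```python
from typing import Dict, FrozenSet

BBA = Dict[FrozenSet[int], float]

# Fused masses of Table II (alternatives A_1..A_4 encoded as 1..4).
TABLE_II: BBA = {
    frozenset({1}): 0.2488, frozenset({2}): 0.1142,
    frozenset({3}): 0.1600, frozenset({4}): 0.1865,
    frozenset({1, 4}): 0.0045, frozenset({2, 4}): 0.0094,
    frozenset({1, 2, 4}): 0.0236, frozenset({3, 4}): 0.0075,
    frozenset({1, 3, 4}): 0.0198, frozenset({2, 3, 4}): 0.0374,
    frozenset({1, 2, 3, 4}): 0.1883,
}


def bel(m: BBA, i: int) -> float:
    """Credibility of the singleton {i}: only its own mass contributes."""
    return m.get(frozenset({i}), 0.0)


def pl(m: BBA, i: int) -> float:
    """Plausibility of {i}: total mass of the focal elements containing i."""
    return sum(mass for focal, mass in m.items() if i in focal)


def betp(m: BBA, i: int) -> float:
    """Pignistic probability: each mass is shared equally among its elements."""
    return sum(mass / len(focal) for focal, mass in m.items() if i in focal)


def dsmp0(m: BBA, i: int) -> float:
    """DSmP with epsilon = 0: shares proportional to the singleton masses."""
    total = 0.0
    for focal, mass in m.items():
        if i in focal:
            denom = sum(bel(m, j) for j in focal)
            total += mass * bel(m, i) / denom
    return total


measures = {i: (bel(TABLE_II, i), betp(TABLE_II, i), dsmp0(TABLE_II, i), pl(TABLE_II, i))
            for i in (1, 2, 3, 4)}
decision = max(measures, key=lambda i: measures[i][1])   # max of BetP
```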

Advantages and extension of COWA-ER: COWA-PCR5
also makes it easy to take a decision not only on a single
alternative but also, if one wants, on a group/subset of alternatives
satisfying a minimum credibility (or plausibility) level selected by
the decision-maker. With such an approach, it is of course very
easy to discount each bba $m_i(.)$ entering the fusion process
using reliability or importance discounting techniques, which
makes this approach more appealing and flexible for the user
than classical OWA. COWA-PCR5 is simpler to implement
because it does not require the evaluation of all weighting
vectors for the bags by mathematical programming. Only the
extreme and very simple weighting vectors $[1, 0, \ldots, 0]$ and
$[0, \ldots, 0, 1]$ are used in COWA-ER. Of course, COWA-ER can
also be extended directly to the fusion of several sources of
information when each source provides a payoff matrix. It
suffices to apply COWA-ER on each matrix to get the bba’s of
step 3, then combine them with PCR5 (or any other rule) and
then apply step 4 of COWA-ER. We can also discount each
source easily if needed. All these advantages make the COWA-ER
approach very flexible and appealing for MCDM under
uncertainty. In summary, the original OWA approach considers
several alternatives $A_i$ evaluated in the context of different
uncertain scenarios and includes several ways (pessimistic,
optimistic, Hurwicz, normative) to interpret and aggregate the
evaluations with respect to a given scenario. COWA-ER uses
simultaneously the two extreme pessimistic and optimistic
decision attitudes combined with an efficient fusion rule, as
shown in Figure 3. In order to save computational resources
(if required), we have also proposed a less efficient OWA
approach using the classical concept of expected utility based
on DSmP or BetP.


[Figure 3 (diagram). Top panel: Yager’s OWA operator (Yager, 2008) applied to the payoff matrix (alternatives × states of nature/scenarios), with optimistic, pessimistic, Hurwicz or normative choices of the weighting vector. Bottom panels: two evolutions in the DSmT context: (1) use of subjective probabilities BetP or DSmP to calculate an expected payoff E[C_i] for each alternative; (2) COWA: E[C_i] belonging to an interval bounded by the pessimistic and optimistic choices, converted into bba’s on Θ = {A_1, ..., A_q} and fused (DSmT, PCR rule).]

Figure 3. COWA-ER: Two evolutions of Yager’s OWA method.


V. CONCLUSION 


In this work, Yager’s Ordered Weighted Averaging (OWA) 
operators are extended and simplified with evidential reasoning 
(ER) for MCDM under uncertainty. The new Cautious OWA- 
ER method is very flexible and requires less computational 
load than classical OWA. COWA-ER improves the existing 
framework for MCDM since it can deal also with several 
more or less reliable sources. Further developments are now 
planned to combine uncertainty about states of the world with 
the imperfection and uncertainty of alternatives evaluations 
as previously introduced in the ER-MCDA and DSmT-AHP 
methods in order to connect them with COWA-ER. 


REFERENCES 


[1] M. Beynon, B. Curry, P.H. Morgan, The Dempster-Shafer theory of evidence: An alternative approach to multicriteria decision modelling, Omega, Vol. 28, No. 1, pp. 37-50, 2000.

[2] M. Beynon, D. Cosker, D. Marshall, An expert system for multi-criteria decision making using Dempster-Shafer theory, Expert Syst. with Appl., Vol. 20, No. 4, pp. 357-367, 2001.

[3] M. Beynon, DS/AHP method: A mathematical analysis, including an understanding of uncertainty, Eur. J. of Oper. Research, Vol. 140, pp. 148-164, 2002.

[4] M. Beynon, Understanding local ignorance and non-specificity within the DS/AHP method of multi-criteria decision making, Eur. J. of Oper. Research, Vol. 163, pp. 403-417, 2005.

[5] M. Beynon, A method of aggregation in DS/AHP for group decision-making with non-equivalent importance of individuals in the group, Comp. and Oper. Research, No. 32, pp. 1881-1896, 2005.

[6] D. Bouyssou, Modelling inaccurate determination, uncertainty, imprecision using multiple criteria, Lecture Notes in Econ. & Math. Syst., 335:78-87, 1989.

[7] J. Dezert, Autonomous navigation with uncertain reference points using the PDAF, in Multitarget-Multisensor Tracking, Vol. 2, pp. 271-324, Y. Bar-Shalom Editor, Artech House, 1991.

[8] J. Dezert, J.-M. Tacnet, M. Batton-Hubert, F. Smarandache, Multi-criteria decision making based on DSmT-AHP, in Proc. of Belief 2010 Int. Workshop, Brest, France, 1-2 April, 2010.

[9] J. Dezert, F. Smarandache, Non Bayesian conditioning and deconditioning, in Proc. of Belief 2010 Int. Workshop, Brest, France, 2010.

[10] D. Han, J. Dezert, C. Han, Y. Yang, Is Entropy Enough to Evaluate the Probability Transformation Approach of Belief Function?, in Proceedings of Fusion 2010 conference, Edinburgh, UK, July 2010.

[11] Z. Hua, B. Gong, X. Xu, A DS-AHP approach for multi-attribute decision making problem with incomplete information, Expert Systems with Appl., 34(3):2221-2227, 2008.

[12] D. Kahneman, A. Tversky, Prospect theory: An analysis of decision under risk, Econometrica, 47:263-291, 1979.

[13] A. Martin, A.-L. Jousselme, C. Osswald, Conflict measure for the discounting operation on belief functions, Proc. of Fusion 2008 Int. Conf.

[14] P. Mongin, Expected utility theory, Handbook of Economic Methodology, pp. 342-350, Edward Elgar, London, 1997.

[15] M. S. Ozdemir, T. L. Saaty, The unknown in decision making: What to do about it?, Eur. J. of Oper. Research, 174(1):349-359, 2006.

[16] B. Roy, Main sources of inaccurate determination, uncertainty and imprecision in decision models, Math. & Comput. Modelling, 12(10-11):1245-1254, 1989.

[17] B. Roy, Paradigms and challenges, in Multiple Criteria Decision Analysis: State of the art surveys, Vol. 78 of Int. Series in Oper. Research & Management Sci. (Chap. 1), pp. 1-24, Springer, 2005.

[18] T.L. Saaty, A scaling method for priorities in hierarchical structures, J. of Math. Psych., Vol. 15, pp. 59-62, 1977.

[19] T.L. Saaty, The Analytical Hierarchy Process, McGraw Hill, 1980.

[20] T.L. Saaty, Fundamentals of decision making and priority theory with the analytic hierarchy process, Vol. VI of the AHP series, RWL Publ., Pittsburgh, PA, USA, 2000.

[21] G. Shafer, A Mathematical Theory of Evidence, Princeton Univ. Press, 1976.

[22] F. Smarandache, J. Dezert (Editors), Advances and Applications of DSmT for Information Fusion, American Research Press, Rehoboth, Vol. 1-3, 2004-2009 - see http://fs.gallup.unm.edu//DSmT.htm.

[23] F. Smarandache, J. Dezert, J.-M. Tacnet, Fusion of sources of evidence with different importances and reliabilities, in Proc. of Fusion 2010 Int. Conf., Edinburgh, UK, July 2010.

[24] P. Smets, The Combination of Evidence in the Transferable Belief Model, IEEE Trans. PAMI 12, pp. 447-458, 1990.

[25] P. Smets, R. Kennes, The transferable belief model, Artif. Intel., 66(2), pp. 191-234, 1994.

[26] Ph. Smets, Decision making in the TBM: the necessity of the pignistic transformation, Int. J. of Approx. Reas., Vol. 38, pp. 133-147, 2005.

[27] T. J. Stewart, Dealing with uncertainties in MCDA, in Multiple Criteria Decision Analysis: State of the art surveys, Vol. 78 of Int. Series in Op. Res. & Manag. Sci. (Chap. 11), pp. 445-466, Springer, 2005.

[28] J.-M. Tacnet, M. Batton-Hubert, J. Dezert, Information fusion for natural hazards in mountains, in [22], Vol. 3, 2009.

[29] J.-M. Tacnet, M. Batton-Hubert, J. Dezert, A two-step fusion process for multi-criteria decision applied to natural hazards in mountains, Proc. of Belief 2010 Int. Workshop, Brest, France, 2010.

[30] D. Von Winterfeldt, W. Edwards, Decision analysis and behavioral research, Cambridge Univ. Press, 1986.

[31] R. Yager, On ordered weighted averaging operators in multi-criteria decision making, IEEE Trans. on SMC, 18:183-190, 1988.

[32] R. Yager, Induced ordered weighted averaging operators, IEEE Trans. on SMC, Part B: Cybernetics, Vol. 29, No. 2, pp. 141-150, April, 1999.

[33] R. Yager, Decision making under Dempster-Shafer uncertainties, Studies in Fuzziness and Soft Computing, 219:619-632, 2008.

[34] L. Zadeh, On the validity of Dempster’s rule of combination, Memo M79/24, Univ. of California, Berkeley, USA, 1979.


Advances and Applications of DSmT for Information Fusion. Collected Works. Volume 4 


Sigmoidal Model for Belief Function- 
Based Electre Tri Method 


Jean Dezert 
Jean-Marc Tacnet 


Originally published as Dezert J., Tacnet J.-M., Sigmoidal Model
for Belief Function-based Electre Tri method, in Belief 2012,
Compiègne, May 2012, and reprinted with permission.


Abstract. Main decision-making problems can be described as the choice, ranking or
sorting of a set of alternatives or solutions. The principle of the Electre TRI (ET) method
is to sort alternatives $a_i$ according to criteria $g_j$ into categories $C_h$ whose lower and
upper limits are respectively $b_h$ and $b_{h+1}$. The sorting procedure is based on the
evaluations of outranking relations based firstly on the calculation of partial concordance
and discordance indexes and secondly on global concordance and credibility
indexes. In this paper, we propose to replace the calculation of the original concordance
and discordance indexes of the ET method by a more effective sigmoidal model.
Such a model is part of a new Belief Function ET (BF-ET) method under development
and allows a comprehensive, elegant and continuous mathematical representation of
the degree of concordance, discordance and the uncertainty level, which is not
taken into account explicitly in the classical Electre Tri.


1 Introduction 


The Electre Tri (ET) method, developed by Yu [13], remains one of the most successful
and widely applied methods for multiple criteria decision aiding (MCDA) sorting
problems [5]. The ET method assigns a set of given alternatives $a_i \in A$, $i = 1, 2, \ldots, n$,
according to criteria $g_j$, $j = 1, 2, \ldots, m$, to a pre-defined (and ordered) set of categories
$C_h \in C$, $h = 1, 2, \ldots, p+1$, whose lower and upper limits are respectively $b_h$ and $b_{h+1}$
(for all $h = 1, \ldots, p$), with $b_0 \leq b_1 \leq b_2 \leq \ldots \leq b_{h-1} \leq b_h \leq \ldots \leq b_p$. The assignment
of an alternative $a_i$ to a category $C_h$ (limited by profiles $b_h$ and $b_{h+1}$) consists
in four steps involving first the computation of the global concordance $c(a_i, b_h)$ and
discordance $d(a_i, b_h)$ indexes¹ (steps 1 & 2), secondly their fusion into a credibility
index $\rho(a_i, b_h)$ (step 3), and finally the decision and choice of the category based
on the evaluations of outranking relations [13, 6] (step 4). The partial concordance
index $c_j(a_i, b_h)$ measures the concordance of $a_i$ and $b_h$ in the assertion "$a_i$ is at least
as good as $b_h$". The partial discordance index $d_j(a_i, b_h)$ measures the opposition of
$a_i$ and $b_h$ in the assertion "$a_i$ is at least as good as $b_h$". The global concordance
index $c(a_i, b_h)$ measures the concordance of $a_i$ and $b_h$ on all criteria in the assertion
"$a_i$ outranks $b_h$". The degree of credibility of the outranking relation, denoted
$\rho(a_i, b_h)$, expresses to which extent "$a_i$ outranks $b_h$" according to $c(a_i, b_h)$ and
$d_j(a_i, b_h)$ for all criteria. The main steps of the ET method are described below:

¹ Themselves computed from partial concordance and discordance indexes based on a given
set of criteria $g_j(.)$, $j \in J$.


1. Concordance Index: The concordance index $c(a_i, b_h) \in [0,1]$ between the
alternative $a_i$ and the category $C_h$ is computed as the weighted average of the partial
concordance indexes $c_j(a_i, b_h)$, that is

$$
c(a_i, b_h) = \sum_{j \in J} w_j c_j(a_i, b_h) \tag{1}
$$

where the weights $w_j \in [0,1]$ represent the relative importance of each criterion
$g_j(.)$ in the evaluation of the global concordance index. They must satisfy
$\sum_{j \in J} w_j = 1$. The partial concordance index $c_j(a_i, b_h) \in [0,1]$ based on
a given criterion $g_j(.)$ is computed from the difference between the criterion
evaluated for the profile $b_h$ and the criterion evaluated for the alternative $a_i$. If
the difference $g_j(b_h) - g_j(a_i)$ is less than (or equal to) a given preference threshold
$q_j(g_j(b_h))$, then $a_i$ and $C_h$ are considered as different based on the criterion
$g_j(.)$, so that a preference of $a_i$ with respect to $C_h$ can be clearly established.
If the difference $g_j(b_h) - g_j(a_i)$ is strictly greater than another given threshold
$p_j(g_j(b_h))$, then $a_i$ and $C_h$ are considered as indifferent (similar) based on $g_j(.)$.
When $g_j(b_h) - g_j(a_i) \in [q_j(g_j(b_h)), p_j(g_j(b_h))]$, the partial concordance index
$c_j(a_i, b_h)$ is computed from a linear interpolation. Mathematically, the partial
concordance index is obtained by:

$$
c_j(a_i, b_h) \triangleq
\begin{cases}
1 & \text{if } g_j(b_h) - g_j(a_i) \leq q_j(g_j(b_h)) \\
0 & \text{if } g_j(b_h) - g_j(a_i) > p_j(g_j(b_h)) \\
\dfrac{g_j(a_i) + p_j(g_j(b_h)) - g_j(b_h)}{p_j(g_j(b_h)) - q_j(g_j(b_h))} & \text{otherwise}
\end{cases}
\tag{2}
$$
2. Discordance Index: The discordance index between the alternative $a_i$ and the
category $C_h$ depends on a possible veto condition expressed by the choice of a
veto threshold $v_j(g_j(b_h))$ imposed on some criterion $g_j(.)$. The (global) discordance
index $d(a_i, b_h)$ is computed from the partial discordance indexes:

$$
d_j(a_i, b_h) \triangleq
\begin{cases}
1 & \text{if } g_j(b_h) - g_j(a_i) > v_j(g_j(b_h)) \\
0 & \text{if } g_j(b_h) - g_j(a_i) \leq p_j(g_j(b_h)) \\
\dfrac{g_j(b_h) - g_j(a_i) - p_j(g_j(b_h))}{v_j(g_j(b_h)) - p_j(g_j(b_h))} & \text{otherwise}
\end{cases}
\tag{3}
$$

One defines by $V$ the set of indexes $j \in J$ where the veto applies (where the
partial discordance index is greater than the global concordance index), that is

$$
V \triangleq \{ j \in J \mid d_j(a_i, b_h) > c(a_i, b_h) \} \tag{4}
$$

Then a global discordance index can be defined [12] as

$$
d(a_i, b_h) \triangleq
\begin{cases}
1 & \text{if } V = \emptyset \\
\prod_{j \in V} \dfrac{1 - d_j(a_i, b_h)}{1 - c(a_i, b_h)} & \text{if } V \neq \emptyset
\end{cases}
\tag{5}
$$

3. Global Credibility Index: In the ET method, the (global) credibility index $\rho(a_i, b_h)$
is computed by the simple discounting of the concordance index $c(a_i, b_h)$ given
by (1) by the discordance index (discounting factor) $d(a_i, b_h)$ given in (5).
Mathematically, this is given by

$$
\rho(a_i, b_h) = c(a_i, b_h) \, d(a_i, b_h) \tag{6}
$$


4. Assignment Procedure: The assignment of a given action $a_i$ to a certain category $C_h$ results from the comparison of $a_i$ to the profiles defining the lower and upper limits of the categories. For a given category limit $b_h$, this comparison relies on the credibility of the assertion "$a_i$ outranks $b_h$". Once all credibility indexes $\rho(a_i,b_h)$ for $i = 1,2,\ldots,m$ and $h = 1,2,\ldots,k$ have been computed, the assignment matrix $M \triangleq [\rho(a_i,b_h)]$ is available for helping in the final decision-making process. In the ELECTRE TRI method, a simple $\lambda$-cutting level strategy (for a given choice of $\lambda \in [0.5, 1]$) is used in order to transform the fuzzy outranking relation into a crisp one to determine if each alternative outranks (or not) each category. This is done by testing if $\rho(a_i,b_h) \ge \lambda$. If the inequality is satisfied, it means that indeed $a_i$ outranks the category $C_h$. Based on outranking relations between all pairs of alternatives and profiles of categories, two approaches are proposed in ELECTRE TRI to finally assign the alternatives into categories, see [5] for details:


• Pessimistic (conjunctive) approach: $a_i$ is compared with $b_k, b_{k-1}, b_{k-2}, \ldots$, until $a_i$ outranks $b_h$ where $h \le k$. The alternative $a_i$ is then assigned to the highest category $C_h$ if $\rho(a_i,b_h) \ge \lambda$ for a given threshold $\lambda$.

• Optimistic (disjunctive) approach: $a_i$ is compared with $b_1, b_2, \ldots, b_h, \ldots$ until $b_h$ outranks $a_i$. The alternative $a_i$ is assigned to the lowest category $C_h$ for which the upper profile $b_h$ is preferred to $a_i$.
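Steps 1 to 3 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names are hypothetical, the thresholds are passed as plain numbers (already evaluated at $g_j(b_h)$), and the global concordance index of Eq. (1) is taken as an input because its formula is not restated here.

```python
def partial_concordance(g_a, g_b, q, p):
    """Partial concordance index c_j(a_i, b_h) of Eq. (2)."""
    diff = g_b - g_a
    if diff <= q:
        return 1.0
    if diff > p:
        return 0.0
    return (g_a + p - g_b) / (p - q)  # linear interpolation between q and p


def partial_discordance(g_a, g_b, p, v):
    """Partial discordance index d_j(a_i, b_h) of Eq. (3)."""
    diff = g_b - g_a
    if diff > v:
        return 1.0
    if diff <= p:
        return 0.0
    return (diff - p) / (v - p)  # linear interpolation between p and v


def credibility(c_global, d_partials):
    """Credibility index of Eq. (6): c(a_i, b_h) discounted by Eq. (5).

    c_global is assumed to come from Eq. (1); d_partials lists the partial
    discordance indexes d_j(a_i, b_h) over all criteria j.
    """
    rho = c_global
    for d in d_partials:
        if d > c_global:  # criterion j belongs to the veto set V of Eq. (4)
            rho *= (1.0 - d) / (1.0 - c_global)
    return rho


# Hypothetical usage: one criterion with g_j(b_h) = 50, q = 20, p = 25, v = 40
c_j = partial_concordance(27.5, 50.0, 20.0, 25.0)
d_j = partial_discordance(20.0, 50.0, 25.0, 40.0)
rho = credibility(0.6, [0.2, 0.8])
```

The $\lambda$-cut of step 4 then reduces to testing `rho >= lam` for a chosen `lam` in $[0.5, 1]$.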


The objective and motivation of this paper is to develop a new Belief Function based ET method that takes into account the potential of BF to model uncertainties. The whole BF-ET method is under development and will be presented and evaluated on a detailed practical example in a forthcoming publication. Due to space limitations, we only present here how we propose to compute the new concordance and discordance indexes used in our BF-ET.
2 Limitations of the Classical Electre Tri


Each step of the ET method relies on a heuristic approach rather than on a theoretical one. Belief functions can improve the ET method because of their ability to model and manage conflicting as well as uncertain information in a theoretical framework. We only focus here on steps 1 and 2 and we propose a solution to overcome their limitations in the next section.


Example 1: Let's consider $g_j(a_i) \in [0,100]$, and let's take $g_j(b_h) = 50$ and the following thresholds: $q_j(g_j(b_h)) = 20$ (indifference threshold), $p_j(g_j(b_h)) = 25$ (preference threshold) and $v_j(g_j(b_h)) = 40$ (veto threshold). Then the local concordance and discordance indexes obtained in steps 1 and 2 of ET are shown in Fig. 1.


[Figure: $c_j(a_i,b_h)$ and $d_j(a_i,b_h)$ under ELECTRE TRI modeling, plotted against $g_j(a_i)$ from 0 to 100.]

Fig. 1 Example of partial concordance and discordance indexes.


From this very simple example, one sees that ET modeling of partial concordance and discordance indexes is not very satisfactory since there is no clear (explicit and consistent) modeling of the uncertainty area where the action $a_i$ is not totally discordant, nor totally concordant with the profile $b_h$. In such simplistic modeling, there exist points $g_j(a_i)$ (lying on the slope of the blue or red curves) that can be not totally concordant while being totally not discordant (and vice-versa), which is counter-intuitive and rather abnormal. This drawback will be solved using our new sigmoidal basic belief assignment (bba) modeling presented in the next section.
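As a worked illustration of this point, the two evaluations below are chosen here for convenience (they are not taken from the figure) and use the thresholds of Example 1, i.e. $g_j(b_h) = 50$, $q_j = 20$, $p_j = 25$ and $v_j = 40$ in Eqs. (2) and (3):

```latex
% Case g_j(a_i) = 27.5: the difference is 50 - 27.5 = 22.5, which lies in [20, 25]
c_j(a_i,b_h) = \frac{27.5 + 25 - 50}{25 - 20} = 0.5,
\qquad
d_j(a_i,b_h) = 0 \quad (\text{since } 22.5 \le 25)

% Case g_j(a_i) = 20: the difference is 50 - 20 = 30, which lies in (25, 40]
c_j(a_i,b_h) = 0 \quad (\text{since } 30 > 25),
\qquad
d_j(a_i,b_h) = \frac{30 - 25}{40 - 25} = \frac{1}{3}
```

In the first case the alternative is only half concordant yet not discordant at all; in the second it is not concordant at all yet only one-third discordant. In both cases the remaining part is left unmodeled.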


3 Sigmoidal Model for Concordance and Discordance Indexes 


In fact, there are several ways to compute partial concordance and discordance indexes and to combine them in order to provide the global credibility indexes $\rho(a_i,b_h)$. Electre Tri proposes a simple and basic approach based on hard thresholding techniques for doing this. It can fail to work efficiently in practice in some cases, or may require a lot of experience to calibrate/tune all setting parameters in order to get pertinent results for decision-making support. Usually, a sensitivity analysis must be done very carefully before applying ET in real applications. Here, we propose a more flexible approach based on sigmoidal modeling where no hard thresholding technique is required.

In the ET approach, we are mainly concerned with the evaluation of the credibility indexes $\rho(a_i,b_h) \in [0,1]$ for $i = 1,2,\ldots,m$ and $h = 1,2,\ldots,k$ (step 3) from which the final decision (assignment) will be drawn in step 4. Step 3 is conditioned by the results of steps 1 and 2, which can be improved using belief functions. For such purpose, we consider a binary frame of discernment² $\Theta \triangleq \{c, \bar{c}\}$ where $c$ means that the alternative $a_i$ is concordant with the assertion "$a_i$ is at least as good as profile $b_h$", and $\bar{c}$ means that the alternative $a_i$ is opposed (discordant) to this assertion. This must obviously be done for all the assertions to check in the ET framework. The basic idea is, for each pair $(a_i, b_h)$, to evaluate its bba $m_{ih}(.)$ defined on the power set of $\Theta$, denoted $2^\Theta$. Such bba's have of course to be defined from the combination (fusion) of the local bba's $m^j_{ih}(.)$ evaluated from each possible criterion $g_j(.)$ (as in steps 1 and 2). The main issue is to derive the local bba's $m^j_{ih}(.)$ defined on $2^\Theta$ from the knowledge of the criteria $g_j(.)$ and of the preference, indifference and veto thresholds $p_j(g_j(b_h))$, $q_j(g_j(b_h))$ and $v_j(g_j(b_h))$ respectively. It turns out that this can be easily obtained from the new method of construction of bba presented in [4] and adapted here to the ET context as follows:


• Let $g_j(a_i)$ be the evaluation of the criterion $g_j(.)$ for the alternative $a_i$. Following the ET approach, when $g_j(a_i) \ge g_j(b_h) - q_j(g_j(b_h))$ the belief in concordance $c$ must be high (close to one), whereas it must be low (close to zero) as soon as $g_j(a_i) < g_j(b_h) - p_j(g_j(b_h))$. Similarly, the belief in discordance $\bar{c}$ must be high (close to one) if $g_j(a_i) < g_j(b_h) - v_j(g_j(b_h))$, and it must be low (close to zero) when $g_j(a_i) \ge g_j(b_h) - p_j(g_j(b_h))$. Such behavior can be modeled directly from the sigmoid functions defined by $f_{s,t}(g) \triangleq 1/(1 + e^{-s(g-t)})$, where $g$ is the criterion magnitude of the alternative under consideration, $t$ is the abscissa of the inflection point of the sigmoid, and $s/4$ is the slope³ of the tangent at the inflection point. It can be easily verified that the bba $m^j_{ih}(.)$ satisfying the expected behavior can be obtained by the fusion⁴ of the two simple bba's $m_1(.)$ and $m_2(.)$ defined in Table 1, where the abscissas of the inflection points are given by $t_c = g_j(b_h) - \frac{1}{2}(p_j(g_j(b_h)) + q_j(g_j(b_h)))$ and $t_{\bar{c}} = g_j(b_h) - \frac{1}{2}(p_j(g_j(b_h)) + v_j(g_j(b_h)))$, and the parameters $s_c$ and $s_{\bar{c}}$ are given by⁵ $s_c = 4/(p_j(g_j(b_h)) - q_j(g_j(b_h)))$ and $s_{\bar{c}} = 4/(v_j(g_j(b_h)) - p_j(g_j(b_h)))$.


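Table 1 Construction of $m_1(.)$ and $m_2(.)$.

focal element       $m_1(.)$                   $m_2(.)$
$c$                 $f_{s_c,t_c}(g)$           $0$
$\bar{c}$           $0$                        $f_{-s_{\bar{c}},t_{\bar{c}}}(g)$
$c \cup \bar{c}$    $1 - f_{s_c,t_c}(g)$       $1 - f_{-s_{\bar{c}},t_{\bar{c}}}(g)$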


2 Here we assume that Shafer's model holds, that is $c \cap \bar{c} = \emptyset$.

3 i.e. the ratio of the vertical and horizontal distances between two points on a line; zero if the line is horizontal, undefined if it is vertical.

4 With averaging rule, PCR5 rule, or Dempster-Shafer rule [8].

5 The coefficient 4 appearing in the expressions of $s_c$ and $s_{\bar{c}}$ comes from the fact that for a sigmoid of parameter $s$, the slope of the tangent at its inflection point is $s/4$.
• From the setting of threshold parameters $p_j(g_j(b_h))$, $q_j(g_j(b_h))$ and $v_j(g_j(b_h))$, it is easy to compute the parameters of the sigmoids $(t_c, s_c)$ and $(t_{\bar{c}}, s_{\bar{c}})$, and thus to get the values of bba's $m_1(.)$ and $m_2(.)$. Once this has been done, the local bba $m^j_{ih}(.)$ is computed by the fusion (denoted $\oplus$) of bba's $m_1(.)$ and $m_2(.)$, that is $m^j_{ih}(.) = [m_1 \oplus m_2](.)$. As shown in [4], the choice of a particular rule of combination (Dempster, PCR5, or hybrid rule) has only a little impact on the result of the combined bba $m^j_{ih}(.)$. But since PCR5 proposes a better management of conflicting bba's, yielding more specific results than other rules [1], we use it to combine $m_1(.)$ with $m_2(.)$ to compute $m^j_{ih}(.)$ associated with the criterion $g_j(.)$ and the pair $(a_i, b_h)$. In adopting such sigmoidal modeling, we now get from $m^j_{ih}(.)$ a fully consistent and elegant representation of local concordance $c_j(a_i,b_h)$ (step 1 of ET), local discordance $d_j(a_i,b_h)$ (step 2 of ET), as well as of the local uncertainty $u_j(a_i,b_h)$ by considering: $c_j(a_i,b_h) = m^j_{ih}(c) \in [0,1]$, $d_j(a_i,b_h) = m^j_{ih}(\bar{c}) \in [0,1]$ and $u_j(a_i,b_h) = m^j_{ih}(c \cup \bar{c}) \in [0,1]$. Of course, one also has $c_j(a_i,b_h) + d_j(a_i,b_h) + u_j(a_i,b_h) = 1$.
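The construction above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the function names and the dictionary keys `"c"`, `"cbar"` and `"theta"` (for $c \cup \bar{c}$) are hypothetical, and the PCR5 rule is written out only for this special case, where $m_1$ commits mass to $c$ and $\Theta$ and $m_2$ commits mass to $\bar{c}$ and $\Theta$, so that the single conflicting product $m_1(c)\,m_2(\bar{c})$ is split between $c$ and $\bar{c}$ in proportion to $m_1(c)$ and $m_2(\bar{c})$.

```python
import math


def sigmoid(g, s, t):
    """Sigmoid f_{s,t}(g) = 1 / (1 + exp(-s (g - t)))."""
    return 1.0 / (1.0 + math.exp(-s * (g - t)))


def local_bba(g_a, g_b, q, p, v):
    """Local bba over {c, cbar, theta} for one criterion and one profile.

    g_a, g_b: evaluations g_j(a_i) and g_j(b_h); q, p, v: thresholds
    (assumed to satisfy q < p < v so that both slopes are finite).
    """
    t_c = g_b - 0.5 * (p + q)       # inflection point of the concordance sigmoid
    t_cbar = g_b - 0.5 * (p + v)    # inflection point of the discordance sigmoid
    s_c = 4.0 / (p - q)
    s_cbar = 4.0 / (v - p)

    a = sigmoid(g_a, s_c, t_c)          # m1(c);    m1(theta) = 1 - a
    b = sigmoid(g_a, -s_cbar, t_cbar)   # m2(cbar); m2(theta) = 1 - b

    # Conjunctive consensus plus PCR5 redistribution of the conflict a * b
    conflict = a * b
    to_c = a * conflict / (a + b) if conflict > 0.0 else 0.0
    to_cbar = b * conflict / (a + b) if conflict > 0.0 else 0.0
    return {
        "c": a * (1.0 - b) + to_c,
        "cbar": b * (1.0 - a) + to_cbar,
        "theta": (1.0 - a) * (1.0 - b),
    }


# Usage with the thresholds of Example 1 (g_j(b_h) = 50, q = 20, p = 25, v = 40)
m_mid = local_bba(27.5, 50.0, 20.0, 25.0, 40.0)
m_high = local_bba(100.0, 50.0, 20.0, 25.0, 40.0)
m_low = local_bba(0.0, 50.0, 20.0, 25.0, 40.0)
```

The three returned values play the roles of $c_j(a_i,b_h)$, $d_j(a_i,b_h)$ and $u_j(a_i,b_h)$ and sum to one by construction.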


4 Example of a Sigmoidal Model 


Coming back to Example 1, the inflection points of the sigmoids $f_1(g) \triangleq f_{s_c,t_c}(g)$ and $f_2(g) \triangleq f_{-s_{\bar{c}},t_{\bar{c}}}(g)$ have the following abscissas: $t_c = 50 - (25+20)/2 = 27.5$ and $t_{\bar{c}} = 50 - (25+40)/2 = 17.5$, with parameters $s_c = 4/(25-20) = 4/5 = 0.8$ and $s_{\bar{c}} = 4/(40-25) = 4/15 \approx 0.2667$. The two sigmoids $f_1(g_j(a_i))$ and $f_2(g_j(a_i))$ are shown in Fig. 2.


[Figure: Sigmoids used in BF-ELECTRE TRI modeling.]

Fig. 2 $f_1(g_j(a_i))$ and $f_2(g_j(a_i))$ sigmoids.


It is interesting to note the resemblance of Fig. 2 with Fig. 1. From these sigmoids, the bba's $m_1(.)$ and $m_2(.)$ are computed according to Table 1 and shown in Fig. 3.

Fig. 3 Bba's $m_1(.)$ and $m_2(.)$ to combine.


The consistent bba $m^j_{ih}(.)$ is obtained by the PCR5 fusion of the bba's $m_1(.)$ and $m_2(.)$. The result is shown in Fig. 4.

[Figure: PCR5 fusion of $m_1(.)$ with $m_2(.)$.]

Fig. 4 $m^j_{ih}(.)$ obtained from the PCR5 fusion of $m_1(.)$ with $m_2(.)$.


From this new sigmoidal modeling, we can compute the local bba's $m^j_{ih}(.)$ derived from the knowledge of criterion $g_j(.)$ and setting parameters. This is a smooth, appealing and elegant technique to build all the local bba's: no hard thresholding is necessary because of the continuity of sigmoid functions.

One can then compute the global concordance and discordance indexes of steps 1 and 2 from the combined bba $m_{ih}(.)$ resulting from the fusion of the local bba's $m^j_{ih}(.)$, possibly taking into account their importance and reliability⁶ (if one wants). This can be done using the recent fusion techniques proposed in [9], or by a simple weighted averaging. From $m_{ih}(.)$ we can use the same credibility index as in step 3 of ET, or just skip this third step and define a decision-making rule based directly on the bba $m_{ih}(.)$ using classical approaches of the belief function framework (say the max of belief, plausibility, or pignistic probability, etc).
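The simplest of these options, a weighted averaging of the local bba's followed by a pignistic decision, might look as follows. This is a hedged sketch only: the weights, the dictionary layout (`"c"`, `"cbar"`, `"theta"` for $c \cup \bar{c}$) and the function names are assumptions made for illustration, and the fusion techniques of [9] are not reproduced here.

```python
def weighted_average_bba(bbas, weights):
    """Weighted average of bba's defined on the binary frame {c, cbar}.

    bbas: list of dicts with keys "c", "cbar", "theta";
    weights: non-negative importance weights (hypothetical values).
    """
    total = sum(weights)
    return {
        key: sum(w * m[key] for m, w in zip(bbas, weights)) / total
        for key in ("c", "cbar", "theta")
    }


def pignistic(m):
    """Pignistic probability on a binary frame: the mass of the total
    ignorance is shared equally between c and cbar."""
    return {"c": m["c"] + 0.5 * m["theta"], "cbar": m["cbar"] + 0.5 * m["theta"]}


# Hypothetical local bba's for two criteria, the first being twice as important
local = [
    {"c": 0.6, "cbar": 0.1, "theta": 0.3},
    {"c": 0.2, "cbar": 0.5, "theta": 0.3},
]
m_global = weighted_average_bba(local, [2.0, 1.0])
bet_p = pignistic(m_global)
decision = max(bet_p, key=bet_p.get)  # "c" means a_i is judged at least as good as b_h
```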


6 In classical ET, the reliability of criteria is not taken into account. 
5 Conclusions


After a brief presentation of the classical ET method, we have proposed a new approach to model and compute the concordance and discordance indexes based on belief functions in order to overcome the limitations of steps 1 and 2 of the ET approach. The advantage of our modeling is that it provides an elegant and simple way to compute not only the concordance and discordance indexes, but also the uncertainty level that may occur when information appears partially concordant and discordant. Improvements of the other steps of the ET method are under development. In future research, we will evaluate our BF-ET on real MCDA problems and compare it with the original ET method and with other belief function based methods already available in MCDA frameworks [10, 11].
Abstract—Dempster-Shafer evidence theory is very important in the fields of information fusion and decision making. However, it incurs a high computational cost when the frames of discernment to deal with become large. To reduce the heavy computational load involved in many rules of combination, the approximation of a general belief function is needed. In this paper we present a new general principle for uncertainty reduction based on the hierarchical proportional redistribution (HPR) method, which allows one to approximate any general basic belief assignment (bba) at a given level of non-specificity, up to the ultimate level 1 corresponding to a Bayesian bba. The level of non-specificity can be adjusted by the users. Some experiments are provided to illustrate our proposed HPR method.

Index Terms—Belief functions, hierarchical proportional redis- 
tribution (HPR), evidence combination, belief approximation. 


I. INTRODUCTION 


Dempster-Shafer evidence theory, also called belief function 
theory [1], is an interesting and flexible tool to deal with 
imprecision and uncertainty for approximate reasoning. It 
has been widely used in many applications, e.g., information 
fusion, pattern recognition and decision making [2]. 

Although evidence theory is successful in uncertainty modeling and reasoning, its high computational cost is a drawback that is often raised against it [2]. In fact, the computational cost of evidence combination increases exponentially with respect to the size of the frame of discernment (FOD) [3]-[5]. To resolve such a problem, three major types of approaches have been proposed by researchers.

The first type is to propose efficient procedures for performing exact computations. For example, Kennes [6] proposed an optimal algorithm for Dempster's rule of combination. Barnett's work [7] and other works [8] are also representative of this type.

The second type is composed of Monte-Carlo techniques. 
See details in the paper of Moral and Salmeron [9]. 

The third type is to approximate (or simplify) a belief function to a simpler one. The papers of Voorbraak [4], Dubois and Prade [10] are seminal works in this type of approach. Tessem proposed the famous $k-l-x$ approximation approach [3]. Grabisch proposed some approaches [11], which can build a bridge between belief functions and other types of uncertainty measures or functions, e.g., probabilities, possibilities and k-additive belief functions (those belief functions whose focal elements have cardinality at most k). Based on the pignistic transformation in the transferable belief model (TBM), Burger and Cuzzolin proposed two types of k-additive belief functions [12]. Denœux uses a hierarchical clustering strategy to implement the inner and outer approximation of belief functions [13].


*This work was supported by National Natural Science Foundation of China (Grant No. 61104214), Fundamental Research Funds for the Central Universities, China Postdoctoral Science Foundation (No. 20100481337, No. 201104670) and Research Fund of Shaanxi Key Laboratory of Electronic Information System Integration (No. 201101Y17).


Originally published as Dezert J., Han D., Liu Z., Tacnet J.-M., Hierarchical Proportional Redistribution principle for uncertainty reduction and bba approximation, WCICA2012 Beijing, China, July 2012, and reprinted with permission.

In this paper, we focus on the approximation approach of belief functions. The first reason is that it can reduce the computational cost of evidence combination. Furthermore, humans find it unintuitive to attach meaning to focal elements with large cardinality [14]. Belief approximation can reduce the number of focal elements, their cardinalities, or both. Thus, by using belief function approximation, we can obtain a representation which is more intuitive and easier to process. We propose a new method called hierarchical proportional redistribution (HPR), which is a general principle for uncertainty reduction, to approximate any general basic belief assignment (bba) at a given level of non-specificity, up to the ultimate level 1 corresponding to a Bayesian bba. That is, our proposed approach can generate an intermediate object between probabilities and the original belief function. The level of non-specificity can be controlled by the users by adjusting the maximum cardinality of the remaining focal elements. Our proposed approach can be considered as a generalized k-additive belief approximation. Some experiments are provided to illustrate our proposed HPR approach and to compare it with other approximation approaches.


II. BASICS IN EVIDENCE THEORY 


In Dempster-Shafer evidence theory [1], the frame of discernment (FOD) denoted by $\Theta$ is a basic concept. The elements in $\Theta$ are mutually exclusive. Suppose that $2^\Theta$ denotes the powerset of the FOD and define the function $m : 2^\Theta \to [0,1]$ as the basic belief assignment (bba) satisfying:

\[
\sum_{A \subseteq \Theta} m(A) = 1, \qquad m(\emptyset) = 0
\tag{1}
\]
A bba is also called a mass function. The belief function (Bel) and the plausibility function (Pl) are defined below, respectively:

\[
Bel(A) = \sum_{B \subseteq A} m(B)
\tag{2}
\]
\[
Pl(A) = \sum_{A \cap B \neq \emptyset} m(B)
\tag{3}
\]


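Suppose that $m_1, m_2, \ldots, m_n$ are $n$ mass functions. Dempster's rule of combination is defined in (4):

\[
m(A) =
\begin{cases}
0, & A = \emptyset \\
\dfrac{\displaystyle\sum_{\cap A_i = A}\ \prod_{1 \le i \le n} m_i(A_i)}{\displaystyle\sum_{\cap A_i \neq \emptyset}\ \prod_{1 \le i \le n} m_i(A_i)}, & A \neq \emptyset
\end{cases}
\tag{4}
\]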


Dempster’s rule of combination is used in DST to imple- 
ment the combination of bodies of evidence (BOEs). 
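Equations (2) to (4) translate directly into code once a bba is stored as a mapping from focal elements to masses. The sketch below is illustrative only: the representation (a `dict` keyed by `frozenset`) and the function names are assumptions, and the rule is written for two sources; since Dempster's rule is associative, more sources can be handled by combining them one after another.

```python
from itertools import product


def bel(m, a):
    """Belief of a: total mass of the focal elements included in a (Eq. (2))."""
    return sum(v for b, v in m.items() if b <= a)


def pl(m, a):
    """Plausibility of a: total mass of the focal elements intersecting a (Eq. (3))."""
    return sum(v for b, v in m.items() if b & a)


def dempster(m1, m2):
    """Dempster's rule of combination (Eq. (4)) for two mass functions."""
    combined = {}
    for (a, va), (b, vb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:  # products falling on the empty set are discarded
            combined[inter] = combined.get(inter, 0.0) + va * vb
    norm = sum(combined.values())  # equals 1 minus the conflicting mass
    if norm == 0.0:
        raise ValueError("totally conflicting sources cannot be combined")
    return {a: v / norm for a, v in combined.items()}


# Hypothetical two-element frame {a, b}
A, B, THETA = frozenset("a"), frozenset("b"), frozenset("ab")
m12 = dempster({A: 0.6, THETA: 0.4}, {B: 0.5, THETA: 0.5})
```

The double loop over focal elements makes the cost visible: each source can have up to $2^{|\Theta|} - 1$ focal elements, which is why approximating a bba before combination can pay off.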

Evidence theory has been widely used in many application fields due to its capability of approximate reasoning and processing of uncertain information. However, as mentioned in the introduction, evidence combination suffers from a high computational cost. Several approaches have been proposed accordingly, which include efficient algorithms [6]-[8] for evidence combination, Monte-Carlo techniques [9] and belief function approximation. We prefer the belief approximation approach [10]-[13] because it reduces the computational cost of the combination operation and also allows one to deal with smaller-size focal elements, whose meaning is more intuitive for humans to grasp [14]. In the next section, we recall some well-known basic approximation approaches.


III. BBA APPROXIMATION APPROACHES 


1) $k-l-x$ approach: The $k-l-x$ approach was proposed by Tessem [3]. The simplified or compact bba obtained by using $k-l-x$ satisfies:

1) keep no less than $k$ focal elements;

2) keep no more than $l$ focal elements;

3) the mass assignment to be deleted is no greater than $x$.

In the $k-l-x$ algorithm, the focal elements of an original bba are sorted according to their mass assignments. The algorithm chooses the first $p$ focal elements such that $k \le p \le l$ and such that the sum of the mass assignments of these first $p$ focal elements is no less than $1 - x$. The deleted mass assignments are redistributed to the other focal elements through a normalization.
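A compact sketch of this selection-and-normalization procedure is given below. It is an illustration under assumptions rather than Tessem's reference algorithm: the function name, the `dict`-of-`frozenset` representation and the handling of ties (whatever order the sort produces) are choices made here. The usage lines apply it to the bba of Example 1 of Section V, with the labels `"t1"`, `"t2"`, `"t3"` standing for $\theta_1$, $\theta_2$, $\theta_3$.

```python
def klx(m, k, l, x):
    """k-l-x approximation of a bba given as {frozenset: mass}.

    Keeps the heaviest focal elements: at least k, at most l, stopping
    early once the mass left out is no greater than x; then renormalizes.
    """
    ranked = sorted(
        ((a, v) for a, v in m.items() if v > 0.0),
        key=lambda item: item[1],
        reverse=True,
    )
    kept, kept_mass = [], 0.0
    for a, v in ranked:
        if len(kept) >= l:
            break
        if len(kept) >= k and 1.0 - kept_mass <= x:
            break
        kept.append((a, v))
        kept_mass += v
    return {a: v / kept_mass for a, v in kept}


# bba of Example 1 (Section V)
t1, t2, t3 = frozenset({"t1"}), frozenset({"t2"}), frozenset({"t3"})
bba = {
    t1: 0.10, t2: 0.17, t3: 0.03,
    t1 | t2: 0.15, t1 | t3: 0.20, t2 | t3: 0.05,
    t1 | t2 | t3: 0.30,
}
six = klx(bba, k=6, l=6, x=0.4)    # drops only theta_3
three = klx(bba, k=3, l=3, x=0.4)  # keeps the three heaviest focal elements
```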

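2) k-additive belief function approximation: Given a bba $m : 2^\Theta \to [0,1]$, the k-additive belief function [11], [12] induced by the mass assignment is defined in Eq. (5). Suppose that $B \subseteq \Theta$,

\[
\begin{aligned}
m_k(B) &= m(B) + \sum_{A \supset B,\ A \subseteq \Theta,\ |A| > k} \frac{m(A) \cdot |B|}{N(|A|,k)}, && \forall\, |B| \le k \\
m_k(B) &= 0, && \forall\, |B| > k
\end{aligned}
\tag{5}
\]

where

\[
N(|A|,k) = \sum_{j=1}^{k} \binom{|A|}{j} \cdot j = \sum_{j=1}^{k} \frac{|A|!}{(j-1)!\,(|A|-j)!}
\tag{6}
\]

is the sum of the cardinalities of the non-empty subsets of $A$ of size at most $k$, so that the mass of $A$ is fully redistributed.

It can be seen that for the k-additive belief approximation, the maximum cardinality of available focal elements is no greater than $k$.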

In this section, the $k-l-x$ approach and the k-additive belief function approximation approach have been introduced; they will be compared with our proposed approach introduced in the next section. There are also other types of bba approximation approaches; see the related references for details.


IV. HIERARCHICAL PROPORTIONAL REDISTRIBUTION 
APPROXIMATION 


In this paper we propose a hierarchical bba approximation approach called hierarchical proportional redistribution (HPR), which provides a new way to reduce step by step the mass committed to uncertainties. Ultimately an approximate measure of subjective probability can be obtained if needed, i.e. a so-called Bayesian bba in [1]. It must be noticed that this procedure can be stopped at any step in the process. It thus allows one to reduce the number of focal elements in a given bba in a consistent manner, diminishing the size of the core of the bba and thereby reducing the complexity (if needed) when applying complex rules of combination. We present here the general principle of hierarchical and proportional reduction of uncertainties in order to obtain approximate bba's at the non-specificity level we expect. The principle of redistribution of uncertainty to more specific elements of the core at any given step of the process follows the proportional redistribution already proposed in the (non hierarchical) DSmP transformation proposed recently in [5]. Thus the proposed HPR can be considered as a bba approximation approach inspired by the idea of DSmP.

Let's first introduce two new notations for convenience and for concision:


1) Any element of cardinality $1 \le k \le n$ of the power set $2^\Theta$ will be denoted, by convention, by the generic notation $X(k)$. For example, if $\Theta = \{A, B, C\}$, then $X(2)$ can denote the following partial uncertainties $A \cup B$, $A \cup C$ or $B \cup C$, and $X(3)$ denotes the total uncertainty $A \cup B \cup C$.

2) The proportional redistribution factor (ratio) of width $s$ involving elements $X$ and $Y$ of the powerset is defined as (for $X \neq \emptyset$ and $Y \neq \emptyset$)

\[
R_s(Y,X) \triangleq \frac{m(Y) + \epsilon \cdot |X|}{\displaystyle\sum_{Y \subset X,\ |X|-|Y|=s} \left( m(Y) + \epsilon \cdot |X| \right)}
\tag{7}
\]

where $\epsilon$ is a small positive number introduced here to deal with particular cases where $\sum_{Y \subset X,\ |X|-|Y|=s} m(Y) = 0$. By convention, we will denote $R(Y,X) \triangleq R_1(Y,X)$ when we use the proportional redistribution factors of width $s = 1$.
The HPR is obtained by a step by step (recursive) proportional redistribution of the mass $m(X(k))$ of a given uncertainty $X(k)$ (partial or total) of cardinality $2 \le k \le n$ to all the least specific elements of cardinality $k-1$, i.e. to all possible $X(k-1)$, until $k = 2$ is reached. The proportional redistribution is done from the masses of belief committed to $X(k-1)$ as done classically in the DSmP transformation. The "hierarchical" masses $m_h(.)$ are recursively (backward) computed as follows. Here $m_{h(k)}$ represents the approximate bba obtained at step $n-k$ of HPR, i.e., it has a maximum focal element cardinality of $k$.


\[
\begin{aligned}
m_{h(n-1)}(X(n-1)) &= m(X(n-1)) + \sum_{\substack{X(n) \supset X(n-1) \\ X(n),\,X(n-1) \in 2^\Theta}} \left[ m(X(n)) \cdot R(X(n-1), X(n)) \right]; \\
m_{h(n-1)}(A) &= m(A), \quad \forall\, |A| < n-1
\end{aligned}
\tag{8}
\]

$m_{h(n-1)}(.)$ is the bba obtained at the first step of HPR ($n - (n-1) = 1$); the maximum focal element cardinality of $m_{h(n-1)}$ is $n-1$.


\[
\begin{aligned}
m_{h(n-2)}(X(n-2)) &= m(X(n-2)) + \sum_{\substack{X(n-1) \supset X(n-2) \\ X(n-2),\,X(n-1) \in 2^\Theta}} \left[ m_{h(n-1)}(X(n-1)) \cdot R(X(n-2), X(n-1)) \right]; \\
m_{h(n-2)}(A) &= m_{h(n-1)}(A), \quad \forall\, |A| < n-2
\end{aligned}
\tag{9}
\]

$m_{h(n-2)}(.)$ is the bba obtained at the second step of HPR ($n - (n-2) = 2$); the maximum focal element cardinality of $m_{h(n-2)}$ is $n-2$.


This hierarchical proportional redistribution process can be applied similarly (if one wants) to compute $m_{h(n-3)}(.)$, $m_{h(n-4)}(.)$, ..., $m_{h(2)}(.)$, $m_{h(1)}(.)$ with

\[
\begin{aligned}
m_{h(2)}(X(2)) &= m(X(2)) + \sum_{\substack{X(3) \supset X(2) \\ X(3),\,X(2) \in 2^\Theta}} \left[ m_{h(3)}(X(3)) \cdot R(X(2), X(3)) \right]; \\
m_{h(2)}(A) &= m_{h(3)}(A), \quad \forall\, |A| < 2
\end{aligned}
\tag{10}
\]

$m_{h(2)}(.)$ is the bba obtained at step $n-2$ of HPR; the maximum focal element cardinality of $m_{h(2)}$ is 2.

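Mathematically, for any $X(1) \in \Theta$, i.e. any $\theta_i \in \Theta$, a Bayesian belief function can be obtained by the HPR approach by carrying out all possible steps of proportional redistribution of partial ignorances in order to get

\[
m_{h(1)}(X(1)) = m(X(1)) + \sum_{\substack{X(2) \supset X(1) \\ X(1),\,X(2) \in 2^\Theta}} \left[ m_{h(2)}(X(2)) \cdot R(X(1), X(2)) \right]
\tag{11}
\]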

In fact, $m_{h(1)}(.)$ is a probability transformation, called here the Hierarchical DSmP (HDSmP). Since $X(n)$ is unique and corresponds only to the full ignorance $\theta_1 \cup \theta_2 \cup \ldots \cup \theta_n$, the expression of $m_{h(n-1)}(X(n-1))$ in Eq. (8) just simplifies as

\[
m_{h(n-1)}(X(n-1)) = m(X(n-1)) + m(X(n)) \cdot R(X(n-1), X(n))
\tag{12}
\]


Because the masses of uncertainties are fully and proportionally redistributed to the least specific elements involved in these uncertainties, no mass of belief is lost during the step by step hierarchical process. Thus at any step of HPR the sum of masses of belief is kept to one, and of course the Hierarchical DSmP also satisfies $\sum_{X(1) \in 2^\Theta} m_{h(1)}(X(1)) = 1$.
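The recursion of Eqs. (8) to (11) can be sketched as a short loop over decreasing cardinalities. This is a minimal sketch, not the authors' code: the function name `hpr`, the parameter `k_max` (the maximum focal element cardinality to keep) and the `dict`-of-`frozenset` representation are assumptions made here, and raising an error when a redistribution denominator vanishes is a design choice of this sketch (the text handles that case by taking $\epsilon > 0$).

```python
from itertools import combinations


def hpr(m, frame, k_max=1, eps=0.0):
    """Hierarchical proportional redistribution of a bba {frozenset: mass}.

    Masses of focal elements of cardinality k = n, n-1, ..., k_max + 1 are
    redistributed in turn to their subsets of cardinality k - 1, using the
    factor R(Y, X) of Eq. (7) with width s = 1.
    """
    cur = {frozenset(a): float(v) for a, v in m.items()}
    for k in range(len(frozenset(frame)), k_max, -1):
        # Everything of cardinality other than k is carried over unchanged
        nxt = {a: v for a, v in cur.items() if len(a) != k}
        for x, mass in cur.items():
            if len(x) != k or mass == 0.0:
                continue
            subsets = [frozenset(c) for c in combinations(x, k - 1)]
            # Numerators of R(Y, X): m(Y) + eps * |X|
            weights = [cur.get(y, 0.0) + eps * k for y in subsets]
            total = sum(weights)
            if total == 0.0:
                raise ValueError("zero denominator in R(Y, X); use eps > 0")
            for y, w in zip(subsets, weights):
                nxt[y] = nxt.get(y, 0.0) + mass * w / total
        cur = nxt
    return cur


# bba of Example 1 below ("t1", "t2", "t3" stand for theta_1, theta_2, theta_3)
t1, t2, t3 = frozenset({"t1"}), frozenset({"t2"}), frozenset({"t3"})
frame = t1 | t2 | t3
bba_example1 = {
    t1: 0.10, t2: 0.17, t3: 0.03,
    t1 | t2: 0.15, t1 | t3: 0.20, t2 | t3: 0.05,
    frame: 0.30,
}
step1 = hpr(bba_example1, frame, k_max=2)     # maximum cardinality 2
bayesian = hpr(bba_example1, frame, k_max=1)  # HDSmP, a Bayesian bba
```

Stopping at `k_max=2` gives the intermediate bba of the first step, and `k_max=1` gives the Bayesian bba; the weights at each level are read from the masses before that level is updated, which matches the use of $m(Y)$ in Eq. (7).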


Remark: Depending on the application, it is also possible to modify this HPR approach with little effort into a constrained HPR version (CHPR for short) by forcing the masses of some partial ignorances of cardinality $k+1$ to be (proportionally) redistributed back only to a subset of the partial ignorances of cardinality $k$ included in them. This possibility is not detailed here due to space limitations and its little technical interest.


V. EXAMPLES 


In this section we show in detail how HPR can be applied on very simple examples. Let's examine the following examples based on a simple 3D frame of discernment $\Theta = \{\theta_1, \theta_2, \theta_3\}$ satisfying Shafer's model.

A. Example 1 


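Let's consider the following bba:

$m(\theta_1) = 0.10$, $m(\theta_2) = 0.17$, $m(\theta_3) = 0.03$,
$m(\theta_1 \cup \theta_2) = 0.15$, $m(\theta_1 \cup \theta_3) = 0.20$,
$m(\theta_2 \cup \theta_3) = 0.05$, $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$.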


We apply the hierarchical proportional redistribution (HPR) principle with $\epsilon = 0$ in this example because there is no mass of belief equal to zero. It can be verified that the result obtained with a small positive $\epsilon$ parameter remains (as expected) numerically very close to that obtained with $\epsilon = 0$.

The first step of HPR consists in redistributing back $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$ committed to the full ignorance to the elements $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ only, because these elements are the only elements of cardinality 2 that are included in $\theta_1 \cup \theta_2 \cup \theta_3$. Applying Eq. (8) with $n = 3$, one gets when $X(2) = \theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ the following masses.


\[
m_{h(2)}(\theta_1 \cup \theta_2) = m(\theta_1 \cup \theta_2) + m(X(3)) \cdot R(\theta_1 \cup \theta_2, X(3)) = 0.15 + (0.30 \cdot 0.375) = 0.2625
\]

because $R(\theta_1 \cup \theta_2, X(3)) = \dfrac{0.15}{0.15 + 0.20 + 0.05} = 0.375$.

Similarly, one gets

\[
m_{h(2)}(\theta_1 \cup \theta_3) = m(\theta_1 \cup \theta_3) + m(X(3)) \cdot R(\theta_1 \cup \theta_3, X(3)) = 0.20 + (0.30 \cdot 0.5) = 0.35
\]

because $R(\theta_1 \cup \theta_3, X(3)) = \dfrac{0.20}{0.15 + 0.20 + 0.05} = 0.5$, and also

\[
m_{h(2)}(\theta_2 \cup \theta_3) = m(\theta_2 \cup \theta_3) + m(X(3)) \cdot R(\theta_2 \cup \theta_3, X(3)) = 0.05 + (0.30 \cdot 0.125) = 0.0875
\]

because $R(\theta_2 \cup \theta_3, X(3)) = \dfrac{0.05}{0.15 + 0.20 + 0.05} = 0.125$.

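Now, we go to the next step of the HPR principle, in which one needs to redistribute the masses of partial ignorances $X(2)$ corresponding to $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ back to the singleton elements $X(1)$ corresponding to $\theta_1$, $\theta_2$ and $\theta_3$. We use Eq. (11) for doing this as follows:

\[
\begin{aligned}
m_{h(1)}(\theta_1) &= m(\theta_1) + m_{h(2)}(\theta_1 \cup \theta_2) \cdot R(\theta_1, \theta_1 \cup \theta_2) + m_{h(2)}(\theta_1 \cup \theta_3) \cdot R(\theta_1, \theta_1 \cup \theta_3) \\
&\approx 0.10 + (0.2625 \cdot 0.3703) + (0.35 \cdot 0.7692) \\
&= 0.10 + 0.0972 + 0.2692 = 0.4664
\end{aligned}
\]

because

\[
R(\theta_1, \theta_1 \cup \theta_2) = \frac{0.10}{0.10 + 0.17} \approx 0.3703, \qquad
R(\theta_1, \theta_1 \cup \theta_3) = \frac{0.10}{0.10 + 0.03} \approx 0.7692
\]

Similarly, one gets

\[
\begin{aligned}
m_{h(1)}(\theta_2) &= m(\theta_2) + m_{h(2)}(\theta_1 \cup \theta_2) \cdot R(\theta_2, \theta_1 \cup \theta_2) + m_{h(2)}(\theta_2 \cup \theta_3) \cdot R(\theta_2, \theta_2 \cup \theta_3) \\
&\approx 0.17 + (0.2625 \cdot 0.6297) + (0.0875 \cdot 0.85) \\
&= 0.17 + 0.1653 + 0.0744 = 0.4097
\end{aligned}
\]

because

\[
R(\theta_2, \theta_1 \cup \theta_2) = \frac{0.17}{0.10 + 0.17} \approx 0.6297, \qquad
R(\theta_2, \theta_2 \cup \theta_3) = \frac{0.17}{0.17 + 0.03} = 0.85
\]

and also

\[
\begin{aligned}
m_{h(1)}(\theta_3) &= m(\theta_3) + m_{h(2)}(\theta_1 \cup \theta_3) \cdot R(\theta_3, \theta_1 \cup \theta_3) + m_{h(2)}(\theta_2 \cup \theta_3) \cdot R(\theta_3, \theta_2 \cup \theta_3) \\
&\approx 0.03 + (0.35 \cdot 0.2307) + (0.0875 \cdot 0.15) \\
&= 0.03 + 0.0808 + 0.0131 = 0.1239
\end{aligned}
\]

because

\[
R(\theta_3, \theta_1 \cup \theta_3) = \frac{0.03}{0.10 + 0.03} \approx 0.2307, \qquad
R(\theta_3, \theta_2 \cup \theta_3) = \frac{0.03}{0.17 + 0.03} = 0.15
\]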

Hence, the result of the final step of HPR is:

$m_{h(1)}(\theta_1) = 0.4664$, $m_{h(1)}(\theta_2) = 0.4097$, $m_{h(1)}(\theta_3) = 0.1239$.

We can easily verify that

\[
m_{h(1)}(\theta_1) + m_{h(1)}(\theta_2) + m_{h(1)}(\theta_3) = 1.
\]









































[Figure: redistribution tree. Top: {θ1, θ2, θ3}. Step 1: {θ1, θ2}, {θ1, θ3}, {θ2, θ3}. Step 2: {θ1}, {θ2}, {θ3}.]

Figure 1. Illustration of Example 1


Table I
EXPERIMENTAL RESULTS OF HPR FOR EXAMPLE 1.

                    Approximate bba m_h(k)(.)
Focal elements      k = 3      k = 2      k = 1
θ1                  0.1000     0.1000     0.4664
θ2                  0.1700     0.1700     0.4097
θ3                  0.0300     0.0300     0.1239
θ1 ∪ θ2             0.1500     0.2625     0.0000
θ1 ∪ θ3             0.2000     0.3500     0.0000
θ2 ∪ θ3             0.0500     0.0875     0.0000
θ1 ∪ θ2 ∪ θ3        0.3000     0.0000     0.0000








The procedure is illustrated in Fig. 1. The approximate bba's obtained at each step, with different maximum focal element cardinalities, are listed in Table I.

To compare our proposed HPR with the $k-l-x$ approach, we set the parameters of $k-l-x$ to obtain bba's with the same number of focal elements as HPR at each step. In Example 1, the first step of HPR yields a bba with 6 focal elements. Thus we set $k = l = 6$, $x = 0.4$ for $k-l-x$ to obtain a bba with 6 focal elements. Similarly, the second step of HPR yields a bba with 3 focal elements. Thus we set $k = l = 3$, $x = 0.4$ for $k-l-x$. The results of the $k-l-x$ approach are given in Table II.


Table II
EXPERIMENTAL RESULTS OF k − l − x FOR EXAMPLE 1

                    Approximate bba by k − l − x (x = 0.4)
Focal elements      k = l = 6     k = l = 3
θ1                  0.1031        0.0000
θ2                  0.1753        0.2537
θ3                  0.0000        0.0000
θ1 ∪ θ2             0.1546        0.0000
θ1 ∪ θ3             0.2062        0.2985
θ2 ∪ θ3             0.0515        0.0000
θ1 ∪ θ2 ∪ θ3        0.3093        0.4478

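B. Example 2

Let's consider the following bba:

$m(\theta_1) = 0$, $m(\theta_2) = 0.17$, $m(\theta_3) = 0.13$,
$m(\theta_1 \cup \theta_2) = 0.20$, $m(\theta_1 \cup \theta_3) = 0.20$,
$m(\theta_2 \cup \theta_3) = 0$, $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$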

The first step of HPR (taking $\epsilon = 0$) consists in redistributing back $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$ committed to the full ignorance to the elements $\theta_1 \cup \theta_2$ and $\theta_1 \cup \theta_3$ only, because these are the only elements of cardinality 2 included in $\theta_1 \cup \theta_2 \cup \theta_3$ that carry a non-zero mass. Applying Eq. (8) with $n = 3$, one gets when $X(2) = \theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ the following masses


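\[
m_{h(2)}(\theta_1 \cup \theta_2) = m(\theta_1 \cup \theta_2) + m(X(3)) \cdot R(\theta_1 \cup \theta_2, X(3)) = 0.20 + (0.3 \cdot 0.5) = 0.35
\]

because $R(\theta_1 \cup \theta_2, X(3)) = \dfrac{0.20}{0.20 + 0.20 + 0.00} = 0.5$.

Similarly, one gets

\[
m_{h(2)}(\theta_1 \cup \theta_3) = m(\theta_1 \cup \theta_3) + m(X(3)) \cdot R(\theta_1 \cup \theta_3, X(3)) = 0.20 + (0.3 \cdot 0.5) = 0.35
\]

because $R(\theta_1 \cup \theta_3, X(3)) = \dfrac{0.20}{0.20 + 0.20 + 0.00} = 0.5$, and also

\[
m_{h(2)}(\theta_2 \cup \theta_3) = m(\theta_2 \cup \theta_3) + m(X(3)) \cdot R(\theta_2 \cup \theta_3, X(3)) = 0.00 + (0.3 \cdot 0.0) = 0.0
\]

because $R(\theta_2 \cup \theta_3, X(3)) = \dfrac{0.0}{0.20 + 0.20 + 0.00} = 0$.

Now, we go to the next step of the HPR principle, in which one needs to redistribute the masses of partial ignorances $X(2)$ corresponding to $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ back to the singleton elements $X(1)$ corresponding to $\theta_1$, $\theta_2$ and $\theta_3$. We use Eq. (11) for doing this as follows: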


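\[
\begin{aligned}
m_{h(1)}(\theta_1) &= m(\theta_1) + m_{h(2)}(\theta_1 \cup \theta_2) \cdot R(\theta_1, \theta_1 \cup \theta_2) + m_{h(2)}(\theta_1 \cup \theta_3) \cdot R(\theta_1, \theta_1 \cup \theta_3) \\
&= 0.00 + (0.35 \cdot 0.00) + (0.35 \cdot 0.00) \\
&= 0.00 + 0.00 + 0.00 = 0.00
\end{aligned}
\]

because

\[
R(\theta_1, \theta_1 \cup \theta_2) = \frac{0.00}{0.00 + 0.17} = 0.00, \qquad
R(\theta_1, \theta_1 \cup \theta_3) = \frac{0.00}{0.00 + 0.13} = 0.00
\]

Similarly, one gets

\[
\begin{aligned}
m_{h(1)}(\theta_2) &= m(\theta_2) + m_{h(2)}(\theta_1 \cup \theta_2) \cdot R(\theta_2, \theta_1 \cup \theta_2) + m_{h(2)}(\theta_2 \cup \theta_3) \cdot R(\theta_2, \theta_2 \cup \theta_3) \\
&\approx 0.17 + (0.35 \cdot 1) + (0.00 \cdot 0.5667) \\
&= 0.17 + 0.35 + 0.00 = 0.52
\end{aligned}
\]

because

\[
R(\theta_2, \theta_1 \cup \theta_2) = \frac{0.17}{0.00 + 0.17} = 1, \qquad
R(\theta_2, \theta_2 \cup \theta_3) = \frac{0.17}{0.17 + 0.13} \approx 0.5667
\]

and also

\[
\begin{aligned}
m_{h(1)}(\theta_3) &= m(\theta_3) + m_{h(2)}(\theta_1 \cup \theta_3) \cdot R(\theta_3, \theta_1 \cup \theta_3) + m_{h(2)}(\theta_2 \cup \theta_3) \cdot R(\theta_3, \theta_2 \cup \theta_3) \\
&\approx 0.13 + (0.35 \cdot 1) + (0.00 \cdot 0.4333) \\
&= 0.13 + 0.35 + 0.00 = 0.48
\end{aligned}
\]

because

\[
R(\theta_3, \theta_1 \cup \theta_3) = \frac{0.13}{0.13 + 0.00} = 1, \qquad
R(\theta_3, \theta_2 \cup \theta_3) = \frac{0.13}{0.17 + 0.13} \approx 0.4333
\]

Hence, the result of the final step of HPR is

$m_{h(1)}(\theta_1) = 0.00$, $m_{h(1)}(\theta_2) = 0.52$, $m_{h(1)}(\theta_3) = 0.48$

and we can easily verify that

\[
m_{h(1)}(\theta_1) + m_{h(1)}(\theta_2) + m_{h(1)}(\theta_3) = 1.
\]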

The HPR procedure of Example 2 with $\epsilon = 0$ is illustrated in Fig. 2.

[Figure: redistribution tree. Top: {θ1, θ2, θ3}. Step 1: {θ1, θ2}, {θ1, θ3}. Step 2: {θ1}, {θ2}, {θ3}.]

Figure 2. Illustration of Example 2.


If one takes $\epsilon = 0$, no mass is reassigned to $\theta_2 \cup \theta_3$, as illustrated in Fig. 2. But if one takes $\epsilon > 0$, the HPR procedure of Example 2 is the same as that illustrated in Fig. 1, i.e., some mass is also redistributed to $\theta_2 \cup \theta_3$. That is the difference between Fig. 1 and Fig. 2.

Suppose that $\epsilon = 0.001$; the HPR calculation procedure is as follows.

The first step of HPR consists in distributing back $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$ committed to the full ignorance to the elements $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$. Applying Eq. (8) with $n = 3$, one gets when $X(2) = \theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ the following masses


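$$\begin{aligned} m_{h(2)}(\theta_1 \cup \theta_2) &= m(\theta_1 \cup \theta_2) + m(X(3)) \cdot R(\theta_1 \cup \theta_2, X(3)) \\ &= 0.20 + (0.3 \cdot 0.4963) = 0.3489 \end{aligned}$$

because

$$R(\theta_1 \cup \theta_2, X(3)) = \frac{0.20 + 0.001 \cdot 3}{(0.20 + 0.001 \cdot 3) \cdot 2 + (0.00 + 0.001 \cdot 3)} = 0.4963$$

$$\begin{aligned} m_{h(2)}(\theta_1 \cup \theta_3) &= m(\theta_1 \cup \theta_3) + m(X(3)) \cdot R(\theta_1 \cup \theta_3, X(3)) \\ &= 0.20 + (0.3 \cdot 0.4963) = 0.3489 \end{aligned}$$

because

$$R(\theta_1 \cup \theta_3, X(3)) = \frac{0.20 + 0.001 \cdot 3}{(0.20 + 0.001 \cdot 3) \cdot 2 + (0.00 + 0.001 \cdot 3)} = 0.4963$$

$$\begin{aligned} m_{h(2)}(\theta_2 \cup \theta_3) &= m(\theta_2 \cup \theta_3) + m(X(3)) \cdot R(\theta_2 \cup \theta_3, X(3)) \\ &= 0.00 + (0.3 \cdot 0.0073) = 0.0022 \end{aligned}$$

because

$$R(\theta_2 \cup \theta_3, X(3)) = \frac{0.00 + 0.001 \cdot 3}{(0.20 + 0.001 \cdot 3) \cdot 2 + (0.00 + 0.001 \cdot 3)} = 0.0073$$
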
Now, we go to the next step of the HPR principle: one needs to
redistribute the masses of the partial ignorances $X(2)$, corresponding
to $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$, back to the
singleton elements $X(1)$, corresponding to $\theta_1$, $\theta_2$ and $\theta_3$. We
use Eq. (11) for doing this as follows:

$$\begin{aligned} m_{h(1)}(\theta_1) &= m(\theta_1) + m_h(\theta_1 \cup \theta_2) \cdot R(\theta_1, \theta_1 \cup \theta_2) + m_h(\theta_1 \cup \theta_3) \cdot R(\theta_1, \theta_1 \cup \theta_3) \\ &\approx 0.00 + (0.3489 \cdot 0.0115) + (0.3489 \cdot 0.0149) \\ &= 0.00 + 0.0040 + 0.0052 = 0.0092 \end{aligned}$$

because

$$R(\theta_1, \theta_1 \cup \theta_2) = \frac{0.00 + 0.001 \cdot 2}{(0.00 + 0.001 \cdot 2) + (0.17 + 0.001 \cdot 2)} \approx 0.0115$$

$$R(\theta_1, \theta_1 \cup \theta_3) = \frac{0.00 + 0.001 \cdot 2}{(0.00 + 0.001 \cdot 2) + (0.13 + 0.001 \cdot 2)} \approx 0.0149$$


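Similarly, one gets

$$\begin{aligned} m_{h(1)}(\theta_2) &= m(\theta_2) + m_h(\theta_1 \cup \theta_2) \cdot R(\theta_2, \theta_1 \cup \theta_2) + m_h(\theta_2 \cup \theta_3) \cdot R(\theta_2, \theta_2 \cup \theta_3) \\ &\approx 0.17 + (0.3489 \cdot 0.9885) + (0.0022 \cdot 0.5658) \\ &= 0.17 + 0.3449 + 0.0012 = 0.5161 \end{aligned}$$

because

$$R(\theta_2, \theta_1 \cup \theta_2) = \frac{0.17 + 0.001 \cdot 2}{(0.00 + 0.001 \cdot 2) + (0.17 + 0.001 \cdot 2)} \approx 0.9885$$

$$R(\theta_2, \theta_2 \cup \theta_3) = \frac{0.17 + 0.001 \cdot 2}{(0.17 + 0.001 \cdot 2) + (0.13 + 0.001 \cdot 2)} \approx 0.5658$$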


and also

$$\begin{aligned} m_{h(1)}(\theta_3) &= m(\theta_3) + m_h(\theta_1 \cup \theta_3) \cdot R(\theta_3, \theta_1 \cup \theta_3) + m_h(\theta_2 \cup \theta_3) \cdot R(\theta_3, \theta_2 \cup \theta_3) \\ &\approx 0.13 + (0.3489 \cdot 0.9851) + (0.0022 \cdot 0.4342) \\ &= 0.13 + 0.3437 + 0.0009 = 0.4746 \end{aligned}$$

because

$$R(\theta_3, \theta_1 \cup \theta_3) = \frac{0.13 + 0.001 \cdot 2}{(0.13 + 0.001 \cdot 2) + (0.00 + 0.001 \cdot 2)} \approx 0.9851$$

$$R(\theta_3, \theta_2 \cup \theta_3) = \frac{0.13 + 0.001 \cdot 2}{(0.17 + 0.001 \cdot 2) + (0.13 + 0.001 \cdot 2)} \approx 0.4342$$


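Hence, the final result of the HPR approximation is

$$m_{h(1)}(\theta_1) = 0.0092, \quad m_{h(1)}(\theta_2) = 0.5161, \quad m_{h(1)}(\theta_3) = 0.4746,$$

and we can easily verify that, up to rounding,

$$m_{h(1)}(\theta_1) + m_{h(1)}(\theta_2) + m_{h(1)}(\theta_3) = 1.$$

The bba's obtained in each step are listed in Table III ($\varepsilon = 0$)
and Table IV ($\varepsilon = 0.001$).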
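
The two-step computation of Example 2 can be checked numerically. The following is a minimal sketch only, assuming a bba stored as a Python dictionary from `frozenset` to mass; the function name `hpr_steps`, the labels `t1`, `t2`, `t3` and the parameter `k_stop` are illustrative choices, not notation from the paper.

```python
from itertools import combinations


def hpr_steps(bba, eps=0.0, k_stop=1):
    """Return the bba obtained after each redistribution level (sketch).

    Masses of focal elements of cardinality k are redistributed to their
    subsets of cardinality k-1, proportionally to m(Y) + eps*|X|.
    """
    current = {frozenset(a): v for a, v in bba.items()}
    n = max(len(a) for a in current)  # assumption: start at the largest focal size
    steps = []
    for k in range(n, k_stop, -1):
        # keep everything that is not at level k; level k is redistributed
        updated = {a: v for a, v in current.items() if len(a) != k}
        for x, mass in current.items():
            if len(x) != k or mass == 0.0:
                continue
            children = [frozenset(c) for c in combinations(sorted(x), k - 1)]
            # weights use the masses held by the children before this level
            weights = [current.get(y, 0.0) + eps * len(x) for y in children]
            total = sum(weights)  # zero only if eps == 0 and all children are empty
            for y, w in zip(children, weights):
                updated[y] = updated.get(y, 0.0) + mass * w / total
        current = updated
        steps.append(dict(current))
    return steps


# bba of Example 2
example2 = {
    frozenset({"t2"}): 0.17,
    frozenset({"t3"}): 0.13,
    frozenset({"t1", "t2"}): 0.20,
    frozenset({"t1", "t3"}): 0.20,
    frozenset({"t1", "t2", "t3"}): 0.30,
}

steps_eps0 = hpr_steps(example2, eps=0.0)
steps_eps = hpr_steps(example2, eps=0.001)
```

Under these assumptions, `steps_eps0[-1]` and `steps_eps[-1]` hold the singleton masses after the last step, and stopping earlier (for instance `k_stop=2`) returns only the intermediate approximation.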










































































Table III
EXPERIMENTAL RESULTS OF HPR FOR EXAMPLE 2 ($\varepsilon = 0$)

Focal elements                          Approximate bba $m_{h(k)}(\cdot)$
                                        k = 3     k = 2     k = 1
$\theta_1$                              0.0000    0.0000    0.0000
$\theta_2$                              0.1700    0.1700    0.5200
$\theta_3$                              0.1300    0.1300    0.4800
$\theta_1 \cup \theta_2$                0.2000    0.3500    0.0000
$\theta_1 \cup \theta_3$                0.2000    0.3500    0.0000
$\theta_2 \cup \theta_3$                0.0000    0.0000    0.0000
$\theta_1 \cup \theta_2 \cup \theta_3$  0.3000    0.0000    0.0000

Table IV
EXPERIMENTAL RESULTS OF HPR FOR EXAMPLE 2 ($\varepsilon = 0.001$)

Focal elements                          Approximate bba $m_{h(k)}(\cdot)$
                                        k = 3     k = 2     k = 1
$\theta_1$                              0.0000    0.0000    0.0092
$\theta_2$                              0.1700    0.1700    0.5161
$\theta_3$                              0.1300    0.1300    0.4746
$\theta_1 \cup \theta_2$                0.2000    0.3489    0.0000
$\theta_1 \cup \theta_3$                0.2000    0.3489    0.0000
$\theta_2 \cup \theta_3$                0.0000    0.0022    0.0000
$\theta_1 \cup \theta_2 \cup \theta_3$  0.3000    0.0000    0.0000


























When using the $k-l-x$ approach, the results are given in Table V.





























Table V
EXPERIMENTAL RESULTS OF $k-l-x$ FOR EXAMPLE 2

Focal elements                          $m(\cdot)$ obtained by $k-l-x$
                                        k = l = 6    k = l = 3
$\theta_1$                              0.0000       0.0000
$\theta_2$                              0.1700       0.0000
$\theta_3$                              0.1300       0.0000
$\theta_1 \cup \theta_2$                0.2000       0.2857
$\theta_1 \cup \theta_3$                0.2000       0.2857
$\theta_2 \cup \theta_3$                0.0000       0.0000
$\theta_1 \cup \theta_2 \cup \theta_3$  0.3000       0.4286

C. Example 3
Let’s consider the following bba:

$$m(\theta_1) = 0, \quad m(\theta_2) = 0, \quad m(\theta_3) = 0.70,$$
$$m(\theta_1 \cup \theta_2) = 0, \quad m(\theta_1 \cup \theta_3) = 0,$$
$$m(\theta_2 \cup \theta_3) = 0, \quad m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$$


In this example, the mass assignments of all the focal elements of
cardinality 2 are equal to zero. For HPR, when $\varepsilon > 0$,
$m(\theta_1 \cup \theta_2 \cup \theta_3)$ will be divided equally and redistributed to
$\{\theta_1 \cup \theta_2\}$, $\{\theta_1 \cup \theta_3\}$ and $\{\theta_2 \cup \theta_3\}$, because the
ratios are

$$R(\theta_1 \cup \theta_2, X(3)) = R(\theta_1 \cup \theta_3, X(3)) = R(\theta_2 \cup \theta_3, X(3)) = \frac{0.00 + 0.001 \cdot 3}{(0.00 + 0.001 \cdot 3) \cdot 3} = 0.3333$$

For HPR, when $\varepsilon = 0$, the redistribution cannot be executed
directly, since the ratios take the form 0/0. This shows the necessity
of using $\varepsilon$.

The bba's obtained through $HPR_{\varepsilon=0.001}$ at different steps
are listed in Table VI.
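
The role of $\varepsilon$ in this example can be seen with a few lines of code. This is an illustrative sketch, not part of the original method description; the helper name `ratios` is hypothetical.

```python
def ratios(child_masses, parent_cardinality, eps):
    """Redistribution ratios R(Y, X) for all children Y of one element X."""
    weights = [m + eps * parent_cardinality for m in child_masses]
    total = sum(weights)
    return [w / total for w in weights]


# Example 3: the three subsets of cardinality 2 all carry zero mass.
pair_masses = [0.0, 0.0, 0.0]

# With eps > 0 every ratio is well defined and the split is uniform.
with_eps = ratios(pair_masses, 3, eps=0.001)

# With eps = 0 the ratio is 0/0 and the redistribution is undefined.
try:
    ratios(pair_masses, 3, eps=0.0)
    undefined_without_eps = False
except ZeroDivisionError:
    undefined_without_eps = True
```

Each entry of `with_eps` is 1/3, so the mass 0.30 of the full ignorance is split into three parts of 0.10, which matches the second column of Table VI.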





























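Table VI
EXPERIMENTAL RESULTS OF HPR FOR EXAMPLE 3 ($\varepsilon = 0.001$)

Focal elements                          Approximate bba $m_{h(k)}(\cdot)$
                                        k = 3     k = 2     k = 1
$\theta_1$                              0.0000    0.0000    0.0503
$\theta_2$                              0.0000    0.0000    0.0503
$\theta_3$                              0.7000    0.7000    0.8994
$\theta_1 \cup \theta_2$                0.0000    0.1000    0.0000
$\theta_1 \cup \theta_3$                0.0000    0.1000    0.0000
$\theta_2 \cup \theta_3$                0.0000    0.1000    0.0000
$\theta_1 \cup \theta_2 \cup \theta_3$  0.3000    0.0000    0.0000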























When using the $k-l-x$ approach, the results are given in Table VII.

Table VII
EXPERIMENTAL RESULTS OF $k-l-x$ FOR EXAMPLE 3

Focal elements                          $m(\cdot)$ obtained by $k-l-x$
                                        k = l = 6    k = l = 3
$\theta_1$                              0.0000       0.0000
$\theta_2$                              0.0000       0.0000
$\theta_3$                              0.7000       0.7000
$\theta_1 \cup \theta_2$                0.0000       0.0000
$\theta_1 \cup \theta_3$                0.0000       0.0000
$\theta_2 \cup \theta_3$                0.0000       0.0000
$\theta_1 \cup \theta_2 \cup \theta_3$  0.3000       0.3000

















D. Example 4 (vacuous bba)
Let’s consider the following bba:

$$m(\theta_1) = 0, \quad m(\theta_2) = 0, \quad m(\theta_3) = 0,$$
$$m(\theta_1 \cup \theta_2) = 0, \quad m(\theta_1 \cup \theta_3) = 0,$$
$$m(\theta_2 \cup \theta_3) = 0, \quad m(\theta_1 \cup \theta_2 \cup \theta_3) = 1$$

In this example, the mass assignments of all the focal elements of
cardinality less than 3 are equal to zero. For HPR, when $\varepsilon > 0$,
$m(\theta_1 \cup \theta_2 \cup \theta_3)$ will be divided equally and redistributed to
$\{\theta_1 \cup \theta_2\}$, $\{\theta_1 \cup \theta_3\}$ and $\{\theta_2 \cup \theta_3\}$.

Similarly, the mass assignments of the focal elements of cardinality 2
obtained in the intermediate step will be divided equally and
redistributed to the singletons. This is due to $\varepsilon > 0$.

For HPR, when $\varepsilon = 0$, the redistribution cannot be executed
directly. This shows the necessity of using $\varepsilon$. The bba's obtained
through $HPR_{\varepsilon=0.001}$ at different steps are listed in Table VIII.


Table VIII
EXPERIMENTAL RESULTS OF HPR FOR EXAMPLE 4 ($\varepsilon = 0.001$)

Focal elements                          Approximate bba $m_{h(k)}(\cdot)$
                                        k = 3     k = 2     k = 1
$\theta_1$                              0.0000    0.0000    0.3333
$\theta_2$                              0.0000    0.0000    0.3333
$\theta_3$                              0.0000    0.0000    0.3333
$\theta_1 \cup \theta_2$                0.0000    0.3333    0.0000
$\theta_1 \cup \theta_3$                0.0000    0.3333    0.0000
$\theta_2 \cup \theta_3$                0.0000    0.3333    0.0000
$\theta_1 \cup \theta_2 \cup \theta_3$  1.0000    0.0000    0.0000























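When using the $k-l-x$ approach, the results are given in Table IX.

Table IX
EXPERIMENTAL RESULTS OF $k-l-x$ FOR EXAMPLE 4

Focal elements                          $m(\cdot)$ obtained by $k-l-x$
                                        k = l = 6    k = l = 3
$\theta_1$                              0.0000       0.0000
$\theta_2$                              0.0000       0.0000
$\theta_3$                              0.0000       0.0000
$\theta_1 \cup \theta_2$                0.0000       0.0000
$\theta_1 \cup \theta_3$                0.0000       0.0000
$\theta_2 \cup \theta_3$                0.0000       0.0000
$\theta_1 \cup \theta_2 \cup \theta_3$  1.0000       1.0000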

















From the results of Examples 1 to 4, we can see that with $k-l-x$ the
users can control the number of focal elements but cannot control the
maximum cardinality of the focal elements. Although $k-l-x$ reduces the
number of focal elements, focal elements with large cardinality may
remain. This does not help to further reduce the computational cost,
and it does not make the result easier for a human to interpret.


E. Example 5 


More generally, an approximation method 1 (giving $m_1(\cdot)$) is
considered better than a method 2 (giving $m_2(\cdot)$) if both of the
following conditions are fulfilled: 1) Jousselme's distance of
$m_1(\cdot)$ to the original bba $m(\cdot)$ is smaller than the distance of
$m_2(\cdot)$ to the original bba $m(\cdot)$, i.e. $d(m_1, m) < d(m_2, m)$;
2) the approximate non-specificity value $U(m_1)$ is closer (and lower)
to the true non-specificity value $U(m)$ than $U(m_2)$. Jousselme's
distance is defined in [16], and the non-specificity [17] is given by

$$U(m) = \sum_{A \subseteq \Theta} m(A) \log_2 |A|.$$
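
The two criteria can be sketched in code as follows. The non-specificity follows the formula above. The distance is written from the commonly cited form of Jousselme's distance, $d(m_1, m_2) = \sqrt{\tfrac{1}{2}(m_1 - m_2)^T D (m_1 - m_2)}$ with $D(A, B) = |A \cap B| / |A \cup B|$; this form is an assumption here, since the text only refers to [16] for the definition. Function names and the `frozenset`-keyed dictionary representation are illustrative.

```python
from math import log2, sqrt


def nonspecificity(bba):
    """U(m) = sum of m(A) * log2|A| over the focal elements A."""
    return sum(mass * log2(len(a)) for a, mass in bba.items() if mass > 0)


def jousselme_distance(m1, m2):
    """Distance between two bba's (assumed standard form, see lead-in)."""
    focal = set(m1) | set(m2)
    diff = {a: m1.get(a, 0.0) - m2.get(a, 0.0) for a in focal}
    quad = sum(
        diff[a] * diff[b] * len(a & b) / len(a | b)
        for a in focal
        for b in focal
    )
    # guard against tiny negative values caused by rounding
    return sqrt(max(quad, 0.0) / 2.0)


# bba of Example 2 and the vacuous bba of Example 4
example2 = {
    frozenset({"t2"}): 0.17,
    frozenset({"t3"}): 0.13,
    frozenset({"t1", "t2"}): 0.20,
    frozenset({"t1", "t3"}): 0.20,
    frozenset({"t1", "t2", "t3"}): 0.30,
}
vacuous = {frozenset({"t1", "t2", "t3"}): 1.0}

u_example2 = nonspecificity(example2)
u_vacuous = nonspecificity(vacuous)
d_self = jousselme_distance(example2, example2)
d_opposite = jousselme_distance({frozenset({"t1"}): 1.0}, {frozenset({"t2"}): 1.0})
```

A Bayesian bba has zero non-specificity, so a full HPR run down to the singletons always drives $U$ to zero; the comparison in this example is therefore meaningful at intermediate levels of approximation.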


In this example, we make a comparison between HPR (method 1) and the
k-additive approach (method 2). We consider the FoD
$\Theta = \{\theta_1, \theta_2, \theta_3, \theta_4, \theta_5\}$ and we generate randomly
$L = 30$ bba's by using the algorithm given below [15]:

Input: $\Theta$: frame of discernment;
       $N_{max}$: maximum number of focal elements
Output: Bel: belief function (under the form of a bba, $m$)

Generate the power set of $\Theta$ → $P(\Theta)$;
Generate a random permutation of $P(\Theta)$ → $R(\Theta)$;
Generate an integer between 1 and $N_{max}$ → $k$;
FOREACH of the first $k$ elements of $R(\Theta)$ do
    Generate a value within [0, 1] → $m_i$;
END
Normalize the vector $m'(\cdot) = [m_1, \ldots, m_k]$ → $m(\cdot)$
(that is, $m(A_k) = m_k$);

Algorithm 1: Random generation of bba.
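
Algorithm 1 can be sketched in Python as follows. This is a hedged illustration: the function name `random_bba`, the optional `rng` argument used for reproducibility, and the exclusion of the empty set from the candidate focal elements (so that $m(\emptyset) = 0$) are assumptions of this sketch.

```python
import random
from itertools import combinations


def random_bba(frame, n_max, rng=None):
    """Draw a random bba with at most n_max focal elements (sketch of Algorithm 1)."""
    if rng is None:
        rng = random.Random()
    elements = sorted(frame)
    # all non-empty subsets of the frame (assumption: the empty set is excluded)
    subsets = [
        frozenset(c)
        for r in range(1, len(elements) + 1)
        for c in combinations(elements, r)
    ]
    rng.shuffle(subsets)                              # random permutation
    k = rng.randint(1, min(n_max, len(subsets)))      # number of focal elements
    raw = [rng.random() for _ in range(k)]            # values within [0, 1)
    total = sum(raw)
    return {a: v / total for a, v in zip(subsets[:k], raw)}


frame = ["t1", "t2", "t3", "t4", "t5"]
# L = 30 Monte-Carlo runs, one seeded generator per run for repeatability
bbas = [random_bba(frame, n_max=8, rng=random.Random(j)) for j in range(30)]
```

The value `n_max=8` is arbitrary; the text does not state the value of $N_{max}$ used in the experiment.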


We compute and plot $d(m_1^j, m)$, $d(m_2^j, m)$, $U(m)$, $U(m_1^j)$
and $U(m_2^j)$ for several levels of approximation for
$j = 1, 2, \ldots, L$ (where $j$ is the index of the Monte-Carlo run). The
results are shown in Fig. 3 and indicate clearly the superiority
of HPR over the k-additive approach.

[Three panels: max size of focal element = 4, max size of focal element = 3, max size of focal element = 2]
Figure 3. Illustration of Example 5.


We further use the Normalized Mean Square Error (NMSE)
statistics defined by

$$NMSE_i = \frac{1}{L} \sum_{j=1}^{L} \frac{\left(U(m_i^j) - U(m)\right)^2}{Var(\vec{e}_i)} \tag{13}$$


to evaluate the global quality of the approximation of the
non-specificity by HPR (if $i = 1$) and by the k-additive method
(if $i = 2$). $\vec{e}_i = [e_i^1, \ldots, e_i^L]$ is the approximation error
vector of method #$i$, where $e_i^j = U(m_i^j) - U(m)$ for
$j = 1, \ldots, L$. $Var(\vec{e}_i)$ is the variance of $\vec{e}_i$. The NMSE
results are given in Table X below.











Table X
NMSE RESULTS OF EXAMPLE 5

Max size of focal element |    4    |    3    |    2
k-additive method         |  3.9003 | 21.8118 | 69.0191
HPR method                |  3.9003 | 19.0264 | 61.9468




















Table X shows that HPR outperforms the k-additive method when the
maximum focal element size is 3 or 2, since it provides a lower NMSE
(the two methods give the same NMSE for size 4). In terms of
information loss, HPR is therefore better (it generates less loss) than
the k-additive approximation method.


VI. CONCLUSIONS 


We have proposed a new hierarchical method, called HPR, to approximate
any bba. The non-specificity degree can be easily controlled by the
user. Some examples were provided to show how HPR works, and to show
its rationality and advantage in comparison with some well-known bba
approximation approaches. In future works, we will compare this HPR
method with more bba approximation methods. In this paper, we have used
only the distance of evidence and the non-specificity as performance
criteria. We plan to develop more efficient evaluation criteria for
capturing more aspects of the information expressed in a bba, in order
to measure the global performance of a method, and to design a better
bba approximation approach (if possible).

Hierarchical DSmP Transformation for Decision-Making
under Uncertainty

Jean Dezert
Deqiang Han
Zhun-ga Liu
Jean-Marc Tacnet

Abstract—Dempster-Shafer evidence theory is widely used for 
approximate reasoning under uncertainty; however, the decision- 
making is more intuitive and easy to justify when made in 
the probabilistic context. Thus the transformation to approx- 
imate a belief function into a probability measure is crucial 
and important for decision-making based on evidence theory 
framework. In this paper we present a new transformation of 
any general basic belief assignment (bba) into a Bayesian belief 
assignment (or subjective probability measure) based on new 
proportional and hierarchical principle of uncertainty reduction. 
Some examples are provided to show the rationality and efficiency 
of our proposed probability transformation approach. 


Keywords: Belief functions, probabilistic transformation, 
DSmP, uncertainty, decision-making. 


I. INTRODUCTION 


Dempster-Shafer evidence theory (DST) [1] proposes a
mathematical framework for approximate reasoning under
uncertainty thanks to belief functions. Thus it is widely used in
many fields of information fusion. Like any theory, DST is not
exempt from drawbacks and limitations, such as its inconsistency
with the probability calculus, its complexity and the lack
of a clear decision-making process. Aside from these weaknesses,
the use of belief functions remains flexible and appealing
for modeling and dealing with uncertain and imprecise
information. That is why several modified models and rules of
combination of belief functions were proposed to resolve some
of the drawbacks of the original DST. Among the advances
in belief function theories, one can underline the transferable
belief model (TBM) [2] proposed by Smets, and more recently
the DSmT [3] proposed by Dezert and Smarandache.

The ultimate goal of approximate reasoning under uncer- 
tainty is usually the decision-making. Although the decision- 
making can be done based on evidence expressed by a belief 
function [4], the decision-making is better established in a 
probabilistic context: decisions can be evaluated by assessing 
their ability to provide a winning strategy on the long run in a 
game theory context, or by maximizing return in a utility the- 
ory framework. Thus to take a decision, it is usually preferred 
to transform (approximate) a belief function into a probability 
measure. So the quality of such probability transformation 
is crucial for the decision-making in the evidence theory. 
The research on probability transformation has attracted more 
attention in recent years. 


Originally published as Dezert J., Han D., Liu Z., Tacnet J.-M., Hierarchical 
DSmP transformation for decision-making under uncertainty, in Proc. of 
Fusion 2012, Singapore, July 2012, and reprinted with permission. 


The classical probability transformation in evidence theory 
is the pignistic probability transformation (PPT) [2] in TBM. 
TBM has two levels: the credal level, and the pignistic level. 
Beliefs are entertained, combined and updated at the credal 
level while the decision making is done at the pignistic level. 
PPT maps the beliefs defined on subsets to the probability 
defined on singletons. In PPT, belief assignments for a com- 
pound focal element are equally assigned to the singletons 
included. In fact, PPT is designed according to the principle 
of minimal commitment, which is somehow related with 
uncertainty maximization. 

Other researchers also proposed some modified probability 
transformation approaches [5]—[13] to assign the belief assign- 
ments of compound focal elements to the singletons according 
to some ratio constructed based on some available information. 
The representative transformations include Sudano’s probabil- 
ity transformations [8] and Cuzzolin’s intersection probability 
transformation [13], etc. In the framework of DSmT, another 
probability transformation approach was proposed, which is 
called DSmP [9]. DSmP takes into account both the values 
of the masses and the cardinality of focal elements in the 
proportional redistribution process. DSmP can also be used in 
both DSmT and DST. For a probability transformation, it is 
always evaluated by using probabilistic information content 
(PIC) [5] (PIC being the dual form of Shannon entropy), 
although it is not enough or comprehensive [14]. A probabil- 
ity transformation providing a high probabilistic information 
content (PIC) is preferred in fact for decision-making since 
naturally it is always easier to take a decision when the 
uncertainty is reduced. 

In this paper we propose a new probability transformation,
which can output a probability with high but not exaggerated
PIC. The new approach, called HDSmP (standing for
Hierarchical DSmP), is implemented hierarchically and fully
utilizes the information provided by a given belief function.
Succinctly, for a frame of discernment (FOD) of size $n$, the
following step is repeated for $k = n$ down to $k = 2$: the belief
assignment of a focal element of size $k$ is proportionally
redistributed to the focal elements of size $k - 1$. The
proportion is defined by the ratio among the mass assignments of
the focal elements of size $k - 1$. A parameter $\varepsilon$ is introduced in
the formulas to avoid division by zero and to guarantee the numerical
robustness of the result. HDSmP corresponds to the last step
of the hierarchical proportional redistribution method for basic
belief assignment (bba) approximation presented briefly in
[16] and in more detail in [17]. Some examples are given at
the end of this paper to illustrate our proposed new probability
transformation approach. Comparisons of our new HDSmP
approach with other well-known approaches, with related
analyses, are also provided.


II. EVIDENCE THEORY AND PROBABILITY 
TRANSFORMATIONS 


A. Brief introduction of evidence theory 


In Dempster-Shafer theory [1], the elements in the frame of
discernment (FOD) $\Theta$ are mutually exclusive. Suppose that $2^\Theta$
represents the powerset of the FOD, and one defines the function
$m : 2^\Theta \to [0, 1]$ as the basic belief assignment (bba), also
called mass function, satisfying:

$$\sum_{A \subseteq \Theta} m(A) = 1, \quad m(\emptyset) = 0 \tag{1}$$

Belief function (Bel) and plausibility function (Pl) are
defined below, respectively:

$$Bel(A) = \sum_{B \subseteq A} m(B) \tag{2}$$

$$Pl(A) = \sum_{A \cap B \neq \emptyset} m(B) \tag{3}$$
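
Definitions (2) and (3) translate directly into code. The sketch below assumes a bba stored as a dictionary from `frozenset` to mass; the function names `bel` and `pl` and the element labels are illustrative. The bba used is the one of Example 1 given later in this paper.

```python
def bel(bba, a):
    """Bel(A): total mass of the focal elements included in A."""
    a = frozenset(a)
    return sum(mass for b, mass in bba.items() if b <= a)


def pl(bba, a):
    """Pl(A): total mass of the focal elements intersecting A."""
    a = frozenset(a)
    return sum(mass for b, mass in bba.items() if b & a)


# bba of Example 1
example1 = {
    frozenset({"t1"}): 0.10,
    frozenset({"t2"}): 0.17,
    frozenset({"t3"}): 0.03,
    frozenset({"t1", "t2"}): 0.15,
    frozenset({"t1", "t3"}): 0.20,
    frozenset({"t2", "t3"}): 0.05,
    frozenset({"t1", "t2", "t3"}): 0.30,
}

bel_12 = bel(example1, {"t1", "t2"})   # 0.10 + 0.17 + 0.15
pl_1 = pl(example1, {"t1"})            # 0.10 + 0.15 + 0.20 + 0.30
```

The usual duality $Pl(A) = 1 - Bel(\bar{A})$ can be checked on these values: `pl(example1, {"t1"})` equals `1 - bel(example1, {"t2", "t3"})` up to rounding.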


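Suppose that $m_1, m_2, \ldots, m_n$ are $n$ mass functions. Dempster's
rule of combination is defined in (4):

$$m(A) = \begin{cases} 0, & A = \emptyset \\ \dfrac{\sum_{\cap A_i = A} \prod_{1 \le i \le n} m_i(A_i)}{\sum_{\cap A_i \neq \emptyset} \prod_{1 \le i \le n} m_i(A_i)}, & A \neq \emptyset \end{cases} \tag{4}$$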
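
A minimal sketch of rule (4) is given below. It combines the sources pairwise and relies on the associativity of Dempster's rule to handle $n$ sources; the function names and the two small input bba's are hypothetical illustrations, not data from the paper.

```python
from functools import reduce


def dempster_pair(m1, m2):
    """Combine two bba's with Dempster's rule (normalized conjunctive rule)."""
    joint = {}
    for a, va in m1.items():
        for b, vb in m2.items():
            c = a & b
            joint[c] = joint.get(c, 0.0) + va * vb
    conflict = joint.pop(frozenset(), 0.0)   # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {a: v / (1.0 - conflict) for a, v in joint.items()}


def dempster(*bbas):
    """Combine n bba's by repeated pairwise combination."""
    return reduce(dempster_pair, bbas)


# two illustrative sources on the frame {a, b}
m1 = {frozenset({"a"}): 0.6, frozenset({"a", "b"}): 0.4}
m2 = {frozenset({"b"}): 0.5, frozenset({"a", "b"}): 0.5}
fused = dempster(m1, m2)
```

Here the conflicting mass is $0.6 \cdot 0.5 = 0.3$, so the remaining products are divided by 0.7.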


Dempster’s rule of combination is used in DST to accom- 
plish the fusion of bodies of evidence (BOEs). However, the 
final goal for decision-level information fusion is decision 
making. The beliefs should be transformed into probabilities, 
before the probability-based decision-making. Although there 
are also some research works on making decision directly 
based on belief function or bba [4], probability-based decision 
methods are more intuitive and have become the current 
trend to decide under uncertainty from approximate reasoning 
theories [15]. Some existing and well-known probability trans- 
formation approaches are briefly reviewed in the next section. 


B. Probability transformations used in DST framework 


A probability transformation (or briefly a “probabilization”)
is a mapping $PT : Bel_\Theta \to P_\Theta$, where $Bel_\Theta$ means
the belief function defined on $\Theta$ and $P_\Theta$ represents a
probability measure (in fact a probability mass function,
pmf) defined on $\Theta$. Thus the probability transformation
assigns a Bayesian belief function (i.e. probability measure)
to any general (i.e. non-Bayesian) belief function. This is the
reason why the transformations from belief functions to
probability distributions are sometimes also called Bayesian
transformations.


The major probability transformation approaches used so 
far are: 

a) Pignistic transformation 

The classical pignistic probability was proposed by Smets 
[2] in his TBM framework which is a subjective and a non- 
probabilistic interpretation of DST. It extends the evidence 
theory to the open-world propositions and it has a range of 
tools including discounting and conditioning to handle belief 
functions. At the credal level of TBM, beliefs are entertained, 
combined and updated. While at the pignistic level, beliefs are 
used to make decisions by resorting to pignistic probability 
transformation (PPT). The pignistic probability obtained is 
always called betting commitment probability (in short, BetP). 
The basic idea of pignistic transformation consists of trans- 
ferring the positive belief of each compound (or nonspecific) 
element onto the singletons involved in that compound element 
split by the cardinality of the proposition when working with 
normalized bba’s. 

Suppose that $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ is the FOD. The PPT for
the singletons is defined as [2]:

$$BetP(\theta_i) = \sum_{B \in 2^\Theta,\ \theta_i \in B} \frac{m(B)}{|B|} \tag{5}$$
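
Equation (5) can be sketched as follows, assuming a normalized bba stored as a dictionary from `frozenset` to mass (the function name `betp` and the element labels are illustrative). The bba is the one of Example 1 given later in this paper, and the result can be compared with the BetP row of Table I.

```python
def betp(bba, frame):
    """Pignistic probability: each focal mass is split equally among its elements."""
    prob = {theta: 0.0 for theta in frame}
    for b, mass in bba.items():
        for theta in b:
            prob[theta] += mass / len(b)
    return prob


# bba of Example 1
example1 = {
    frozenset({"t1"}): 0.10,
    frozenset({"t2"}): 0.17,
    frozenset({"t3"}): 0.03,
    frozenset({"t1", "t2"}): 0.15,
    frozenset({"t1", "t3"}): 0.20,
    frozenset({"t2", "t3"}): 0.05,
    frozenset({"t1", "t2", "t3"}): 0.30,
}

p_bet = betp(example1, ["t1", "t2", "t3"])
```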


PPT is designed according to an idea similar to uncertainty 
maximization. In PPT, masses are not assigned discriminately 
to different singletons involved. For information fusion, the 
aim is to reduce the degree of uncertainty and to gain a more 
consolidated and reliable decision result. High uncertainty in 
PPT might not be helpful for the decision. To overcome this, 
some typical modified probability transformation approaches 
were proposed which are summarized below. 

b) Sudano’s probabilities 

Sudano [8] proposed the Probability transformation proportional
to Plausibilities (PrPl), the Probability transformation
proportional to Beliefs (PrBel), the Probability transformation
proportional to the normalized Plausibilities (PrNPl), the Probability
transformation proportional to all Plausibilities (PraPl) and the
Hybrid Probability transformation (PrHyb). As suggested by their
names, different kinds of mappings were used. For the belief
function defined on the FOD $\Theta = \{\theta_1, \ldots, \theta_n\}$, they are
respectively defined by

$$PrPl(\theta_i) = Pl(\{\theta_i\}) \cdot \sum_{Y \in 2^\Theta,\ \theta_i \in Y} \frac{m(Y)}{\sum_{\cup_j \theta_j = Y} Pl(\{\theta_j\})} \tag{6}$$

$$PrBel(\theta_i) = Bel(\{\theta_i\}) \cdot \sum_{Y \in 2^\Theta,\ \theta_i \in Y} \frac{m(Y)}{\sum_{\cup_j \theta_j = Y} Bel(\{\theta_j\})} \tag{7}$$

$$PrNPl(\theta_i) = \frac{Pl(\{\theta_i\})}{\sum_j Pl(\{\theta_j\})} \tag{8}$$

$$PraPl(\theta_i) = Bel(\{\theta_i\}) + \frac{1 - \sum_j Bel(\{\theta_j\})}{\sum_j Pl(\{\theta_j\})} \cdot Pl(\{\theta_i\}) \tag{9}$$

$$PrHyb(\theta_i) = PraPl(\theta_i) \cdot \sum_{Y \in 2^\Theta,\ \theta_i \in Y} \frac{m(Y)}{\sum_{\cup_j \theta_j = Y} PraPl(\theta_j)} \tag{10}$$


c) Cuzzolin’s intersection probability 

From a geometric interpretation of Dempster's rule of
combination, an intersection probability measure was proposed
by Cuzzolin [12] from the proportional repartition of the
total nonspecific mass (TNSM) for each contribution of the
nonspecific masses involved:

$$CuzzP(\theta_i) = m(\{\theta_i\}) + \frac{Pl(\{\theta_i\}) - m(\{\theta_i\})}{\sum_j \left(Pl(\{\theta_j\}) - m(\{\theta_j\})\right)} \cdot TNSM \tag{11}$$

where

$$TNSM = 1 - \sum_j m(\{\theta_j\}) = \sum_{A \in 2^\Theta,\ |A| > 1} m(A)$$


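d) DSmP transformation

DSmP, proposed recently by Dezert and Smarandache, is
defined as follows:

$$DSmP_\varepsilon(\theta_i) = m(\{\theta_i\}) + (m(\{\theta_i\}) + \varepsilon) \cdot \sum_{\substack{X \in 2^\Theta \\ X \supset \theta_i,\ |X| \ge 2}} \frac{m(X)}{\sum_{\substack{Y \in 2^\Theta \\ Y \subset X,\ |Y| = 1}} m(Y) + \varepsilon \cdot |X|} \tag{12}$$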

In DSmP, both the mass assignments and the cardinality 
of focal elements are used in the proportional redistribution 
process. The parameter of e is used to adjust the effect of 
focal element’s cardinality in the proportional redistribution, 
and to make DSmP defined and computable when encoun- 
tering zero masses. DSmP made an improvement compared 
with Sudano’s, Cuzzolin’s and PPT formulas, in that DSmP 
mathematically makes a more judicious redistribution of the 
ignorance masses to the singletons involved and thus increases 
the PIC level of the resulting approximation. Moreover, DSmP 
works for both theories of DST and DSmT. 
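
As a rough illustration, the DSmP redistribution can be sketched for the case of exclusive elements (Shafer's model), where each compound focal element passes its mass to its singletons in proportion to $m(\{\theta_i\}) + \varepsilon$. The function name `dsmp` and the dictionary representation are assumptions of this sketch; the bba is the one of Example 1 given later in this paper.

```python
def dsmp(bba, frame, eps=0.0):
    """DSmP under Shafer's model: proportional redistribution to singletons."""
    single = {theta: bba.get(frozenset([theta]), 0.0) for theta in frame}
    prob = dict(single)
    for x, mass in bba.items():
        if len(x) < 2 or mass == 0.0:
            continue
        # sum of the singleton masses inside X, plus eps * |X|
        denom = sum(single[theta] for theta in x) + eps * len(x)
        for theta in x:
            prob[theta] += (single[theta] + eps) * mass / denom
    return prob


# bba of Example 1
example1 = {
    frozenset({"t1"}): 0.10,
    frozenset({"t2"}): 0.17,
    frozenset({"t3"}): 0.03,
    frozenset({"t1", "t2"}): 0.15,
    frozenset({"t1", "t3"}): 0.20,
    frozenset({"t2", "t3"}): 0.05,
    frozenset({"t1", "t2", "t3"}): 0.30,
}

p_dsmp0 = dsmp(example1, ["t1", "t2", "t3"], eps=0.0)
p_dsmp = dsmp(example1, ["t1", "t2", "t3"], eps=0.001)
```

With $\varepsilon = 0$ and no zero singleton mass, this sketch coincides with PrBel, which is consistent with the identical PrBel and DSmP rows of Table I.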

There are still some other definitions on modified PPT such 
as the iterative and self-consistent approach PrScP proposed 
by Sudano in [5], and a modified PrScP in [11]. Although 
the aforementioned probability transformation approaches are 
different, they are all evaluated according to the degree of 
uncertainty. The classical evaluation criteria for a probability 
transformation are the following ones: 

1) Normalized Shannon Entropy 

Suppose that $P(\theta)$ is a probability mass function (pmf),
where $\theta \in \Theta$, $|\Theta| = N$ and $|\Theta|$ represents the cardinality
of the FOD $\Theta$. An evaluation criterion for the pmf obtained
from different probability transformations is as follows [12]:

$$E_H = \frac{-\sum_{\theta \in \Theta} P(\theta) \log_2 (P(\theta))}{\log_2 N} \tag{14}$$

i.e., the ratio of the Shannon entropy and the maximum of the
Shannon entropy for $\{P(\theta) \mid \theta \in \Theta\}$, $|\Theta| = N$. Clearly $E_H$
is normalized. The larger $E_H$ is, the larger the degree of
uncertainty is. The smaller $E_H$ is, the smaller the degree
of uncertainty is. When $E_H = 0$, one hypothesis has
probability 1 and the rest have zero probability. Therefore
the agent or system can make a decision without error. When
$E_H = 1$, it is impossible to make a correct decision, because
$P(\theta)$ is the same for all $\theta \in \Theta$.

2) Probabilistic Information Content 

The Probabilistic Information Content (PIC) criterion [5] is an
essential measure in any threshold-driven automated decision
system. The PIC value of a pmf obtained from a probability
transformation indicates the level of the total knowledge one
has to draw a correct decision.

$$PIC(P) = 1 + \frac{1}{\log_2 N} \cdot \sum_{\theta \in \Theta} P(\theta) \log_2 (P(\theta)) \tag{15}$$

Obviously, $PIC = 1 - E_H$. The PIC is the dual of the
normalized Shannon entropy. A PIC value of zero indicates
that the knowledge to take a correct decision does not exist (all
hypotheses have equal probabilities, i.e., one has the maximal
entropy).
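
Both criteria are simple to compute. The sketch below uses illustrative function names and takes the pmf as a list of probabilities; the numerical inputs are the BetP and HDSmP rows reported in Table I of this paper, and the convention $0 \log_2 0 = 0$ is assumed.

```python
from math import log2


def normalized_entropy(p):
    """E_H: Shannon entropy of the pmf divided by its maximum log2(N)."""
    h = -sum(v * log2(v) for v in p if v > 0)   # convention: 0 * log2(0) = 0
    return h / log2(len(p))


def pic(p):
    """Probabilistic information content, the dual of E_H."""
    return 1.0 - normalized_entropy(p)


e_betp = normalized_entropy([0.3750, 0.3700, 0.2550])
e_hdsmp = normalized_entropy([0.4664, 0.4097, 0.1239])
pic_uniform = pic([1 / 3, 1 / 3, 1 / 3])
pic_certain = pic([1.0, 0.0, 0.0])
```

The uniform pmf gives a PIC of zero and a pmf concentrated on one hypothesis gives a PIC of one, which are the two extreme cases discussed above.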

Less uncertainty means that the corresponding probability 
transformation result is better to help to take a decision. 
According to such a simple and basic idea, the probability 
transformation approach should attempt to enlarge the belief 
differences among all the propositions and thus to achieve a 
more reliable decision result. 


III. THE HIERARCHICAL DSMP TRANSFORMATION


In this paper, we propose a novel probability transformation
approach called hierarchical DSmP (HDSmP), which provides
a new way to reduce step by step the mass committed to
uncertainties until one obtains an approximate measure of
subjective probability, i.e. a so-called Bayesian bba in [1]. It
must be noticed that this procedure can be stopped at any
step in the process. It thus allows one to reduce the number
of focal elements in a given bba in a consistent manner, to
diminish the size of the core of a bba and thus to reduce the
complexity (if needed) when applying some complex
rules of combination. We present here the general principle
of hierarchical and proportional reduction of uncertainties
in order to finally obtain a Bayesian bba. The principle of
redistribution of uncertainty to more specific elements of the
core at any given step of the process follows the proportional
redistribution already proposed in the (non-hierarchical)
DSmP transformation proposed recently in [3].


Let’s first introduce two new notations for convenience and
for concision:

1) Any element of cardinality $1 \le k \le n$ of the power
set $2^\Theta$ will be denoted, by convention, by the generic
notation $X(k)$. For example, if $\Theta = \{A, B, C\}$, then
$X(2)$ denotes the following partial uncertainties $A \cup B$,
$A \cup C$ or $B \cup C$, and $X(3)$ denotes the total uncertainty
$A \cup B \cup C$.

2) The proportional redistribution factor (ratio) of width $s$
involving elements $Y$ and $X$ of the powerset is defined
as (for $X \neq \emptyset$ and $Y \neq \emptyset$)

$$R_s(Y, X) \triangleq \frac{m(Y) + \varepsilon \cdot |X|}{\sum_{\substack{Y \subset X \\ |X| - |Y| = s}} (m(Y) + \varepsilon \cdot |X|)} \tag{16}$$

where $\varepsilon$ is a small positive number introduced here to
deal with particular cases where $\sum_{Y \subset X,\ |X| - |Y| = s} m(Y) = 0$.

In HDSmP, we just need to use the proportional redistribution
factors of width $s = 1$, and so we will just
denote $R(Y, X) \triangleq R_1(Y, X)$ by convention.

The HDSmP transformation is obtained by a step by step
(recursive) proportional redistribution of the mass $m(X(k))$
of a given uncertainty $X(k)$ (partial or total) of cardinality
$2 \le k \le n$ to all the more specific elements of cardinality
$k - 1$, i.e. to all possible $X(k - 1)$, until $k = 2$ is reached.
The proportional redistribution is done from the masses of
belief committed to $X(k - 1)$, as done classically in the DSmP
transformation. Mathematically, HDSmP is defined for any
$X(1) \in \Theta$, i.e. any $\theta_i \in \Theta$, by

$$HDSmP(X(1)) = m(X(1)) + \sum_{\substack{X(2) \supset X(1) \\ X(1), X(2) \in 2^\Theta}} \left[ m_h(X(2)) \cdot R(X(1), X(2)) \right] \tag{17}$$

where the “hierarchical” masses $m_h(\cdot)$ are recursively (backward)
computed as follows:

$$\begin{cases} m_h(X(n-1)) = m(X(n-1)) + \sum_{\substack{X(n) \supset X(n-1) \\ X(n), X(n-1) \in 2^\Theta}} \left[ m(X(n)) \cdot R(X(n-1), X(n)) \right] \\ m_h(A) = m(A), \ \forall |A| < n - 1 \end{cases} \tag{18}$$

$$\begin{cases} m_h(X(n-2)) = m(X(n-2)) + \sum_{\substack{X(n-1) \supset X(n-2) \\ X(n-2), X(n-1) \in 2^\Theta}} \left[ m_h(X(n-1)) \cdot R(X(n-2), X(n-1)) \right] \\ m_h(A) = m(A), \ \forall |A| < n - 2 \end{cases} \tag{19}$$

$$\vdots$$

$$\begin{cases} m_h(X(2)) = m(X(2)) + \sum_{\substack{X(3) \supset X(2) \\ X(3), X(2) \in 2^\Theta}} \left[ m_h(X(3)) \cdot R(X(2), X(3)) \right] \\ m_h(A) = m(A), \ \forall |A| < 2 \end{cases} \tag{20}$$


It is worth noting that $X(n)$ is in fact unique and
corresponds only to the full ignorance $\theta_1 \cup \theta_2 \cup \ldots \cup \theta_n$.
Therefore, the expression of $m_h(X(n-1))$ in Eq. (18)
simplifies to

$$m_h(X(n-1)) = m(X(n-1)) + m(X(n)) \cdot R(X(n-1), X(n))$$

Because of the full proportional redistribution of the masses
of uncertainties to the more specific elements involved in these
uncertainties, no mass of belief is lost during the step by step
hierarchical process, and thus one finally gets a Bayesian bba
satisfying $\sum_{X(1) \in 2^\Theta} HDSmP(X(1)) = 1$.
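
The recursion (17)-(20) can be sketched for a frame of any size as follows. This is an illustration under stated assumptions rather than a reference implementation: the bba is a dictionary from `frozenset` to mass, the function name `hdsmp` and the element labels are hypothetical, and the input is the bba of Example 1 of Section IV.

```python
from itertools import combinations


def hdsmp(bba, frame, eps=0.0):
    """Hierarchical DSmP: redistribute level k to level k-1, for k = n down to 2."""
    m_h = {frozenset(a): v for a, v in bba.items()}
    for k in range(len(frame), 1, -1):
        before = dict(m_h)   # masses before level k is redistributed
        for x, mass in before.items():
            if len(x) != k:
                continue
            del m_h[x]       # the mass of X(k) is fully passed down
            if mass == 0.0:
                continue
            children = [frozenset(c) for c in combinations(sorted(x), k - 1)]
            # ratios R(Y, X) with width 1, built from the masses of the children
            weights = [before.get(y, 0.0) + eps * k for y in children]
            total = sum(weights)   # zero only if eps == 0 and all children are empty
            for y, w in zip(children, weights):
                m_h[y] = m_h.get(y, 0.0) + mass * w / total
    return {theta: m_h.get(frozenset([theta]), 0.0) for theta in frame}


# bba of Example 1
example1 = {
    frozenset({"t1"}): 0.10,
    frozenset({"t2"}): 0.17,
    frozenset({"t3"}): 0.03,
    frozenset({"t1", "t2"}): 0.15,
    frozenset({"t1", "t3"}): 0.20,
    frozenset({"t2", "t3"}): 0.05,
    frozenset({"t1", "t2", "t3"}): 0.30,
}

p_hdsmp = hdsmp(example1, ["t1", "t2", "t3"], eps=0.0)
```

The snapshot `before` matters: several elements of cardinality $k$ share the same subsets of cardinality $k - 1$, and the ratios must be computed from the masses those subsets held before the current level was redistributed.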


IV. EXAMPLES 


In this section we show in detail how HDSmP can be
applied on very simple examples. So let’s examine
the three following examples based on a simple 3D frame of
discernment $\Theta = \{\theta_1, \theta_2, \theta_3\}$ satisfying Shafer's model.


A. Example 1
Let’s consider the following bba:

$$m(\theta_1) = 0.10, \quad m(\theta_2) = 0.17, \quad m(\theta_3) = 0.03,$$
$$m(\theta_1 \cup \theta_2) = 0.15, \quad m(\theta_1 \cup \theta_3) = 0.20,$$
$$m(\theta_2 \cup \theta_3) = 0.05, \quad m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30.$$

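We apply HDSmP with $\varepsilon = 0$ in this example because there
is no mass of belief equal to zero. It can be verified that the
result obtained with a small positive $\varepsilon$ parameter remains (as
expected) numerically very close to the result obtained with
$\varepsilon = 0$. This verification is left to the reader.

The first step of HDSmP consists in redistributing back
$m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$, committed to the full ignorance,
to the elements $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ only, because
these elements are the only elements of cardinality 2 that are
included in $\theta_1 \cup \theta_2 \cup \theta_3$. Applying Eq. (18) with $n = 3$, one
gets for $X(2) = \theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ the following
masses

$$\begin{aligned} m_h(\theta_1 \cup \theta_2) &= m(\theta_1 \cup \theta_2) + m(X(3)) \cdot R(\theta_1 \cup \theta_2, X(3)) \\ &= 0.15 + (0.3 \cdot 0.375) = 0.2625 \end{aligned}$$

because $R(\theta_1 \cup \theta_2, X(3)) = \frac{0.15}{0.15 + 0.20 + 0.05} = 0.375$.
Similarly, one gets

$$\begin{aligned} m_h(\theta_1 \cup \theta_3) &= m(\theta_1 \cup \theta_3) + m(X(3)) \cdot R(\theta_1 \cup \theta_3, X(3)) \\ &= 0.20 + (0.3 \cdot 0.5) = 0.35 \end{aligned}$$

because $R(\theta_1 \cup \theta_3, X(3)) = \frac{0.20}{0.15 + 0.20 + 0.05} = 0.5$, and also

$$\begin{aligned} m_h(\theta_2 \cup \theta_3) &= m(\theta_2 \cup \theta_3) + m(X(3)) \cdot R(\theta_2 \cup \theta_3, X(3)) \\ &= 0.05 + (0.3 \cdot 0.125) = 0.0875 \end{aligned}$$

because $R(\theta_2 \cup \theta_3, X(3)) = \frac{0.05}{0.15 + 0.20 + 0.05} = 0.125$.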


Now, we go to the next step of HDSmP: one needs to
redistribute the masses of the partial ignorances $X(2)$, corresponding
to $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$, back to the singleton elements
$X(1)$, corresponding to $\theta_1$, $\theta_2$ and $\theta_3$. We use directly HDSmP
in Eq. (17) for doing this as follows:

$$\begin{aligned} HDSmP(\theta_1) &= m(\theta_1) + m_h(\theta_1 \cup \theta_2) \cdot R(\theta_1, \theta_1 \cup \theta_2) + m_h(\theta_1 \cup \theta_3) \cdot R(\theta_1, \theta_1 \cup \theta_3) \\ &\approx 0.10 + (0.2625 \cdot 0.3703) + (0.35 \cdot 0.7692) \\ &= 0.10 + 0.0972 + 0.2692 = 0.4664 \end{aligned}$$

because

$$R(\theta_1, \theta_1 \cup \theta_2) = \frac{0.10}{0.10 + 0.17} \approx 0.3703, \qquad R(\theta_1, \theta_1 \cup \theta_3) = \frac{0.10}{0.10 + 0.03} \approx 0.7692$$





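Similarly, one gets

$$\begin{aligned} HDSmP(\theta_2) &= m(\theta_2) + m_h(\theta_1 \cup \theta_2) \cdot R(\theta_2, \theta_1 \cup \theta_2) + m_h(\theta_2 \cup \theta_3) \cdot R(\theta_2, \theta_2 \cup \theta_3) \\ &\approx 0.17 + (0.2625 \cdot 0.6297) + (0.0875 \cdot 0.85) \\ &= 0.17 + 0.1653 + 0.0744 = 0.4097 \end{aligned}$$

because

$$R(\theta_2, \theta_1 \cup \theta_2) = \frac{0.17}{0.10 + 0.17} \approx 0.6297, \qquad R(\theta_2, \theta_2 \cup \theta_3) = \frac{0.17}{0.17 + 0.03} = 0.85$$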
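and also

$$\begin{aligned} HDSmP(\theta_3) &= m(\theta_3) + m_h(\theta_1 \cup \theta_3) \cdot R(\theta_3, \theta_1 \cup \theta_3) + m_h(\theta_2 \cup \theta_3) \cdot R(\theta_3, \theta_2 \cup \theta_3) \\ &\approx 0.03 + (0.35 \cdot 0.2307) + (0.0875 \cdot 0.15) \\ &= 0.03 + 0.0808 + 0.0131 = 0.1239 \end{aligned}$$

because

$$R(\theta_3, \theta_1 \cup \theta_3) = \frac{0.03}{0.10 + 0.03} \approx 0.2307, \qquad R(\theta_3, \theta_2 \cup \theta_3) = \frac{0.03}{0.17 + 0.03} = 0.15$$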


Hence, the final result of the HDSmP transformation is:

$$HDSmP(\theta_1) = 0.4664, \quad HDSmP(\theta_2) = 0.4097, \quad HDSmP(\theta_3) = 0.1239,$$

and we can easily verify that

$$HDSmP(\theta_1) + HDSmP(\theta_2) + HDSmP(\theta_3) = 1.$$

The procedure is illustrated in Fig. 1 below.























[Diagram: $\{\theta_1, \theta_2, \theta_3\}$; Step 1: the three subsets of cardinality 2; Step 2: the three singletons]
Figure 1. Illustration of Example 1





























Table I
EXPERIMENTAL RESULTS FOR EXAMPLE 1.

Approaches          $\theta_1$   $\theta_2$   $\theta_3$   $E_H$
BetP                0.3750       0.3700       0.2550       0.9868
PrPl                0.4045       0.3681       0.2274       0.9747
PrBel               0.4094       0.4769       0.1137       0.8792
$DSmP_{0}$          0.4094       0.4769       0.1137       0.8792
$DSmP_{0.001}$      0.4094       0.4769       0.1137       0.8792
$HDSmP_{0}$         0.4664       0.4097       0.1239       0.8921
$HDSmP_{0.001}$     0.4664       0.4097       0.1239       0.8921























The classical DSmP transformation [3] and the other transformations (BetP [2], PrBel and PrPl [8]) are compared with HDSmP for this example in Table I. It can be seen in Table I that the normalized entropy $E_H$ of HDSmP is relatively small, but not the smallest, among all the probability transformations used. It is in fact normal that the entropy obtained with HDSmP is a bit bigger than the entropy obtained with DSmP, because there is a “dilution” of uncertainty in the step-by-step redistribution, whereas such dilution of uncertainty is absent in the direct DSmP transformation.
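The entropy column of Table I can be reproduced from the probability columns. The sketch below assumes that the normalized entropy is the Shannon entropy divided by its maximum value log N (N = 3 here); this definition is not restated in this section, but under that assumption the BetP and HDSmP rows of Table I are recovered to within rounding. The function name `normalized_entropy` is an illustrative choice.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy divided by log(N); zero probabilities contribute 0."""
    n = len(probs)
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(n)


# Probability rows copied from Table I.
betp = [0.3750, 0.3700, 0.2550]
hdsmp = [0.4664, 0.4097, 0.1239]

print(round(normalized_entropy(betp), 4))
print(round(normalized_entropy(hdsmp), 4))
```

A uniform distribution gives the maximum value 1, and a distribution concentrated on one singleton gives 0, which matches the extreme rows of Tables III and IV.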


B. Example 2 


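Let’s consider the following bba:

$$
\begin{aligned}
& m(\theta_1) = 0, \quad m(\theta_2) = 0.17, \quad m(\theta_3) = 0.13, \\
& m(\theta_1 \cup \theta_2) = 0.20, \quad m(\theta_1 \cup \theta_3) = 0.20, \\
& m(\theta_2 \cup \theta_3) = 0, \quad m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30
\end{aligned}
$$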


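The first step of HDSmP consists in redistributing the mass $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$ committed to the full ignorance back to the elements $\theta_1 \cup \theta_2$ and $\theta_1 \cup \theta_3$ only, because these are the only elements of cardinality 2 with non-zero mass that are included in $\theta_1 \cup \theta_2 \cup \theta_3$. Applying Eq. (18) with n = 3, one gets for $X(2) = \theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ the following masses:

$$
\begin{aligned}
m_h(\theta_1 \cup \theta_2) &= m(\theta_1 \cup \theta_2) + m(X(3)) \cdot R(\theta_1 \cup \theta_2, X(3)) \\
&= 0.20 + (0.3 \cdot 0.5) = 0.35
\end{aligned}
$$

because $R(\theta_1 \cup \theta_2, X(3)) = \frac{0.20}{0.20 + 0.20 + 0.00} = 0.5$.

Similarly, one gets

$$
\begin{aligned}
m_h(\theta_1 \cup \theta_3) &= m(\theta_1 \cup \theta_3) + m(X(3)) \cdot R(\theta_1 \cup \theta_3, X(3)) \\
&= 0.20 + (0.3 \cdot 0.5) = 0.35
\end{aligned}
$$

because $R(\theta_1 \cup \theta_3, X(3)) = 0.5$, and also

$$
\begin{aligned}
m_h(\theta_2 \cup \theta_3) &= m(\theta_2 \cup \theta_3) + m(X(3)) \cdot R(\theta_2 \cup \theta_3, X(3)) \\
&= 0.00 + (0.3 \cdot 0.0) = 0
\end{aligned}
$$

because $R(\theta_2 \cup \theta_3, X(3)) = \frac{0.00}{0.20 + 0.20 + 0.00} = 0$.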


Now we go to the next and last step of the HDSmP principle: one needs to redistribute the masses of the partial ignorances $X(2)$, corresponding to $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$, back to the singleton elements $X(1)$, corresponding to $\theta_1$, $\theta_2$ and $\theta_3$. We use directly HDSmP in Eq. (17) for doing this as follows:

$$
\begin{aligned}
HDSmP(\theta_1) &= m(\theta_1) + m_h(\theta_1 \cup \theta_2) \cdot R(\theta_1, \theta_1 \cup \theta_2) + m_h(\theta_1 \cup \theta_3) \cdot R(\theta_1, \theta_1 \cup \theta_3) \\
&= 0.00 + (0.35 \cdot 0.00) + (0.35 \cdot 0.00) \\
&= 0.00 + 0.00 + 0.00 = 0
\end{aligned}
$$

because

$$
R(\theta_1, \theta_1 \cup \theta_2) = \frac{0.00}{0.00 + 0.17} = 0.00, \qquad
R(\theta_1, \theta_1 \cup \theta_3) = \frac{0.00}{0.00 + 0.13} = 0.00
$$


Similarly, one gets 


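$$
\begin{aligned}
HDSmP(\theta_2) &= m(\theta_2) + m_h(\theta_1 \cup \theta_2) \cdot R(\theta_2, \theta_1 \cup \theta_2) + m_h(\theta_2 \cup \theta_3) \cdot R(\theta_2, \theta_2 \cup \theta_3) \\
&\approx 0.17 + (0.35 \cdot 1) + (0.00 \cdot 0.5667) \\
&= 0.17 + 0.35 + 0.00 = 0.52
\end{aligned}
$$

because

$$
R(\theta_2, \theta_1 \cup \theta_2) = \frac{0.17}{0.00 + 0.17} = 1, \qquad
R(\theta_2, \theta_2 \cup \theta_3) = \frac{0.17}{0.17 + 0.13} \approx 0.5667
$$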
and also 


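$$
\begin{aligned}
HDSmP(\theta_3) &= m(\theta_3) + m_h(\theta_1 \cup \theta_3) \cdot R(\theta_3, \theta_1 \cup \theta_3) + m_h(\theta_2 \cup \theta_3) \cdot R(\theta_3, \theta_2 \cup \theta_3) \\
&\approx 0.13 + (0.35 \cdot 1) + (0.00 \cdot 0.4333) \\
&= 0.13 + 0.35 + 0.00 = 0.48
\end{aligned}
$$

because

$$
R(\theta_3, \theta_1 \cup \theta_3) = \frac{0.13}{0.13 + 0.00} = 1, \qquad
R(\theta_3, \theta_2 \cup \theta_3) = \frac{0.13}{0.17 + 0.13} \approx 0.4333
$$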


Hence, the final result of the HDSmP transformation is:

$$
HDSmP(\theta_1) = 0, \quad HDSmP(\theta_2) = 0.52, \quad HDSmP(\theta_3) = 0.48,
$$

and we can easily verify that

$$
HDSmP(\theta_1) + HDSmP(\theta_2) + HDSmP(\theta_3) = 1.
$$


The HDSmP procedure of Example 2 with $\epsilon = 0$ is illustrated in Fig. 2. The HDSmP procedure of Example 2 with $\epsilon > 0$ is the same as that illustrated in Fig. 1. When one takes $\epsilon > 0$, some mass is redistributed to $\{\theta_2 \cup \theta_3\}$. If one takes $\epsilon = 0$, no mass is redistributed to $\{\theta_2 \cup \theta_3\}$. That is the difference between Fig. 1 and Fig. 2.


Let’s suppose that one takes $\epsilon = 0.001$; then the HDSmP calculation procedure is as follows:
• Step 1: The first step of HDSmP consists in distributing back $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$ committed to the full ignorance to the elements $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$. Applying the formula























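[Diagram: top level $\{\theta_1, \theta_2, \theta_3\}$; Step 1: $\{\theta_1, \theta_2\}$, $\{\theta_1, \theta_3\}$; Step 2: $\{\theta_1\}$, $\{\theta_2\}$, $\{\theta_3\}$]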


























Figure 2. Illustration of Example 2. 


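(18) with n = 3, one gets for $X(2) = \theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ the following masses:

$$
\begin{aligned}
m_h(\theta_1 \cup \theta_2) &= m(\theta_1 \cup \theta_2) + m(X(3)) \cdot R(\theta_1 \cup \theta_2, X(3)) \\
&= 0.20 + (0.3 \cdot 0.4963) = 0.3489
\end{aligned}
$$

because

$$
R(\theta_1 \cup \theta_2, X(3)) = \frac{0.20 + 0.001 \cdot 3}{(0.20 + 0.001 \cdot 3) \cdot 2 + (0.00 + 0.001 \cdot 3)} = 0.4963
$$

$$
\begin{aligned}
m_h(\theta_1 \cup \theta_3) &= m(\theta_1 \cup \theta_3) + m(X(3)) \cdot R(\theta_1 \cup \theta_3, X(3)) \\
&= 0.20 + (0.3 \cdot 0.4963) = 0.3489
\end{aligned}
$$

because

$$
R(\theta_1 \cup \theta_3, X(3)) = \frac{0.20 + 0.001 \cdot 3}{(0.20 + 0.001 \cdot 3) \cdot 2 + (0.00 + 0.001 \cdot 3)} = 0.4963
$$

$$
\begin{aligned}
m_h(\theta_2 \cup \theta_3) &= m(\theta_2 \cup \theta_3) + m(X(3)) \cdot R(\theta_2 \cup \theta_3, X(3)) \\
&= 0.00 + (0.3 \cdot 0.0073) = 0.0022
\end{aligned}
$$

because

$$
R(\theta_2 \cup \theta_3, X(3)) = \frac{0.001 \cdot 3}{(0.20 + 0.001 \cdot 3) \cdot 2 + (0.00 + 0.001 \cdot 3)} = 0.0073
$$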
• Next step: one needs to redistribute the masses of the partial ignorances $X(2)$, corresponding to $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$, back to the singleton elements $X(1)$, corresponding to $\theta_1$, $\theta_2$ and $\theta_3$. We use directly HDSmP in Eq. (17) for doing this as follows:

$$
\begin{aligned}
HDSmP(\theta_1) &= m(\theta_1) + m_h(\theta_1 \cup \theta_2) \cdot R(\theta_1, \theta_1 \cup \theta_2) + m_h(\theta_1 \cup \theta_3) \cdot R(\theta_1, \theta_1 \cup \theta_3) \\
&\approx 0.00 + (0.3489 \cdot 0.0115) + (0.3489 \cdot 0.0149) \\
&= 0.00 + 0.0040 + 0.0052 = 0.0092
\end{aligned}
$$

because

$$
R(\theta_1, \theta_1 \cup \theta_2) = \frac{0.00 + 0.001 \cdot 2}{(0.00 + 0.001 \cdot 2) + (0.17 + 0.001 \cdot 2)} = 0.0115
$$

$$
R(\theta_1, \theta_1 \cup \theta_3) = \frac{0.00 + 0.001 \cdot 2}{(0.00 + 0.001 \cdot 2) + (0.13 + 0.001 \cdot 2)} = 0.0149
$$

Similarly, one gets

$$
\begin{aligned}
HDSmP(\theta_2) &= m(\theta_2) + m_h(\theta_1 \cup \theta_2) \cdot R(\theta_2, \theta_1 \cup \theta_2) + m_h(\theta_2 \cup \theta_3) \cdot R(\theta_2, \theta_2 \cup \theta_3) \\
&\approx 0.17 + (0.3489 \cdot 0.9885) + (0.0022 \cdot 0.5658) \\
&= 0.17 + 0.3449 + 0.0012 = 0.5161
\end{aligned}
$$

because

$$
R(\theta_2, \theta_1 \cup \theta_2) = \frac{0.17 + 0.001 \cdot 2}{(0.00 + 0.001 \cdot 2) + (0.17 + 0.001 \cdot 2)} = 0.9885
$$

$$
R(\theta_2, \theta_2 \cup \theta_3) = \frac{0.17 + 0.001 \cdot 2}{(0.17 + 0.001 \cdot 2) + (0.13 + 0.001 \cdot 2)} = 0.5658
$$

and also

$$
\begin{aligned}
HDSmP(\theta_3) &= m(\theta_3) + m_h(\theta_1 \cup \theta_3) \cdot R(\theta_3, \theta_1 \cup \theta_3) + m_h(\theta_2 \cup \theta_3) \cdot R(\theta_3, \theta_2 \cup \theta_3) \\
&\approx 0.13 + (0.3489 \cdot 0.9851) + (0.0022 \cdot 0.4342) \\
&= 0.13 + 0.3437 + 0.0009 = 0.4746
\end{aligned}
$$

because

$$
R(\theta_3, \theta_1 \cup \theta_3) = \frac{0.13 + 0.001 \cdot 2}{(0.13 + 0.001 \cdot 2) + (0.00 + 0.001 \cdot 2)} = 0.9851
$$

$$
R(\theta_3, \theta_2 \cup \theta_3) = \frac{0.13 + 0.001 \cdot 2}{(0.17 + 0.001 \cdot 2) + (0.13 + 0.001 \cdot 2)} \approx 0.4342
$$


Hence, the final result of the HDSmP transformation is:

$$
HDSmP(\theta_1) = 0.0092, \quad HDSmP(\theta_2) = 0.5161, \quad HDSmP(\theta_3) = 0.4746,
$$

and we can easily verify that, up to rounding,

$$
HDSmP(\theta_1) + HDSmP(\theta_2) + HDSmP(\theta_3) = 1.
$$
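The two steps worked out above for Example 2 can be put together in a short Python sketch for a frame of three singletons. It follows the numbers of the worked example: ε is multiplied by 3 in the first step and by 2 in the second step. The function name `hdsmp3`, the labels `t1`, `t2`, `t3` and the use of frozensets as keys are illustrative assumptions, not notation from the paper; the sketch is not a general implementation of Eqs. (17) and (18).

```python
from itertools import combinations


def hdsmp3(m, frame=("t1", "t2", "t3"), eps=0.0):
    """Two-step hierarchical redistribution for a 3-element frame.

    `m` maps frozensets of singleton labels to masses; missing keys count as 0.
    """
    def get(names):
        return m.get(frozenset(names), 0.0)

    pairs = list(combinations(frame, 2))

    # Step 1: split the mass of the full ignorance over the three pairs.
    denom = sum(get(p) + eps * 3 for p in pairs)
    m_h = {p: get(p) + get(frame) * (get(p) + eps * 3) / denom for p in pairs}

    # Step 2: split each updated pair mass over its two singletons.
    out = {t: get([t]) for t in frame}
    for p in pairs:
        denom = sum(get([t]) + eps * 2 for t in p)
        for t in p:
            out[t] += m_h[p] * (get([t]) + eps * 2) / denom
    return out


# bba of Example 2.
bba = {
    frozenset(["t2"]): 0.17,
    frozenset(["t3"]): 0.13,
    frozenset(["t1", "t2"]): 0.20,
    frozenset(["t1", "t3"]): 0.20,
    frozenset(["t1", "t2", "t3"]): 0.30,
}

res_eps = hdsmp3(bba, eps=0.001)
res_zero = hdsmp3(bba, eps=0.0)
print({k: round(v, 4) for k, v in res_eps.items()})
print({k: round(v, 4) for k, v in res_zero.items()})
```

With ε = 0 and a denominator equal to zero (as in Examples 3 and 4), this sketch raises `ZeroDivisionError`, which corresponds to the NaN entries reported in the tables.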


We also calculate some other probability transformations 
and the results are listed in Table II. 





























Table II
EXPERIMENTAL RESULTS FOR EXAMPLE 2.

                                      Propositions
Approaches                   $\theta_1$   $\theta_2$   $\theta_3$   $E_H$
BetP                         0.3000       0.3700       0.3300       0.9966
PrPl                         0.3125       0.3683       0.3192       0.9975
PrBel                        NaN          NaN          NaN          NaN
$DSmP_{\epsilon=0}$          0.0000       0.5400       0.4600       0.6280
$DSmP_{\epsilon=0.001}$      0.0037       0.5381       0.4582       0.6479
$HDSmP_{\epsilon=0}$         0.0000       0.5200       0.4800       0.6302
$HDSmP_{\epsilon=0.001}$     0.0092       0.5161       0.4746       0.6720























It can be seen in Table II that the normalized entropy $E_H$ of HDSmP is relatively small, but not the smallest, among all the probability transformations used.


C. Example 3


Let’s consider the following bba:

$$
\begin{aligned}
& m(\theta_1) = 0, \quad m(\theta_2) = 0, \quad m(\theta_3) = 0.70, \\
& m(\theta_1 \cup \theta_2) = 0, \quad m(\theta_1 \cup \theta_3) = 0, \\
& m(\theta_2 \cup \theta_3) = 0, \quad m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30
\end{aligned}
$$


In this example, the mass assignments of all the elements of cardinality 2 are equal to zero. For HDSmP, when $\epsilon > 0$, $m(\theta_1 \cup \theta_2 \cup \theta_3)$ is divided equally and redistributed to $\{\theta_1 \cup \theta_2\}$, $\{\theta_1 \cup \theta_3\}$ and $\{\theta_2 \cup \theta_3\}$, because the ratios are

$$
R(\theta_1 \cup \theta_2, X(3)) = R(\theta_1 \cup \theta_3, X(3)) = R(\theta_2 \cup \theta_3, X(3)) = \frac{0.00 + 0.001 \cdot 3}{(0.00 + 0.001 \cdot 3) \cdot 3} = 0.3333
$$

One sees that with the parameter $\epsilon = 0$, HDSmP cannot be computed (division by zero), and that is why it is necessary to use $\epsilon > 0$ in such a particular case. The results of HDSmP and the other probability transformations are listed in Table III.
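The role of ε in the first step can be seen on the ratios alone. The sketch below computes the three ratios R(., X(3)) from the pair masses, returning NaN when the ratio is the undefined form 0/0; the function name `first_step_ratios` and the choice of returning NaN (rather than raising an error) are assumptions made for illustration.

```python
import math


def first_step_ratios(pair_masses, eps):
    """Ratios used to split the full-ignorance mass over the pairs."""
    weights = [mass + eps * 3 for mass in pair_masses]
    total = sum(weights)
    if total == 0.0:
        # All pair masses are zero and eps = 0: the ratio 0/0 is undefined.
        return [math.nan] * len(weights)
    return [w / total for w in weights]


# Pair masses of Example 3 are all zero.
ratios_zero = first_step_ratios([0.0, 0.0, 0.0], eps=0.0)
ratios_eps = first_step_ratios([0.0, 0.0, 0.0], eps=0.001)
print(ratios_zero)
print(ratios_eps)
```

With ε = 0.001 each pair receives one third of the full-ignorance mass, as stated in the text; with ε = 0 no ratio is defined.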
































Table III
EXPERIMENTAL RESULTS FOR EXAMPLE 3.

                                      Propositions
Approaches                   $\theta_1$   $\theta_2$   $\theta_3$   $E_H$
BetP                         0.1000       0.1000       0.8000       0.5871
PrPl                         0.0562       0.0562       0.8876       0.3911
PrBel                        NaN          NaN          NaN          NaN
$DSmP_{\epsilon=0}$          0.0000       0.0000       1.0000       0.0000
$DSmP_{\epsilon=0.001}$      0.0004       0.0004       0.9992       0.0065
$HDSmP_{\epsilon=0}$         NaN          NaN          NaN          NaN
$HDSmP_{\epsilon=0.001}$     0.0503       0.0503       0.8994       0.3606























It can be seen in Table III that the normalized entropy $E_H$ of HDSmP is relatively small, but not the smallest, among all the probability transformations used. Naturally, and as already pointed out, $HDSmP_{\epsilon=0}$ cannot be computed in this example because of division by zero. With the parameter $\epsilon = 0.001$, however, the mass $m(\theta_1 \cup \theta_2 \cup \theta_3)$ is equally divided and redistributed to the elements of cardinality 2. This justifies the use of the parameter $\epsilon > 0$ in the particular cases where some masses are equal to zero.


D. Example 4 (vacuous bba) 


Let’s consider the following particular bba, called the vacuous bba since it represents a fully ignorant source of evidence:

$$
\begin{aligned}
& m(\theta_1) = 0, \quad m(\theta_2) = 0, \quad m(\theta_3) = 0, \\
& m(\theta_1 \cup \theta_2) = 0, \quad m(\theta_1 \cup \theta_3) = 0, \\
& m(\theta_2 \cup \theta_3) = 0, \quad m(\theta_1 \cup \theta_2 \cup \theta_3) = 1
\end{aligned}
$$

In this example, the mass assignments of all the elements of cardinality less than 3 are equal to zero. For HDSmP, when $\epsilon > 0$, $m(\theta_1 \cup \theta_2 \cup \theta_3)$ is divided equally and redistributed to $\{\theta_1 \cup \theta_2\}$, $\{\theta_1 \cup \theta_3\}$ and $\{\theta_2 \cup \theta_3\}$. Similarly, the masses of the elements of cardinality 2 (partial ignorances) obtained at the intermediate step are divided equally and redistributed to the singletons included in them. This redistribution is possible thanks to $\epsilon > 0$ in the HDSmP formulas. HDSmP cannot be applied and computed in this example if one takes $\epsilon = 0$, and that is why one needs to use $\epsilon > 0$ here. The results of HDSmP and the other probability transformations are listed in Table IV.





























Table IV
EXPERIMENTAL RESULTS FOR EXAMPLE 4.

                                      Propositions
Approaches                   $\theta_1$   $\theta_2$   $\theta_3$   $E_H$
BetP                         0.3333       0.3333       0.3333       1.0000
PrPl                         0.3333       0.3333       0.3333       1.0000
PrBel                        NaN          NaN          NaN          NaN
$DSmP_{\epsilon=0}$          NaN          NaN          NaN          NaN
$DSmP_{\epsilon=0.001}$      0.3333       0.3333       0.3333       1.0000
$HDSmP_{\epsilon=0}$         NaN          NaN          NaN          NaN
$HDSmP_{\epsilon=0.001}$     0.3333       0.3333       0.3333       1.0000























It can be seen in Tables I-IV that the normalized entropy $E_H$ of HDSmP is always moderate compared with the other probability transformations, and it is normal to get a bigger entropy value with HDSmP than with DSmP because of the dilution of uncertainty through the HDSmP procedure. We have already shown that the entropy criterion alone is not enough to evaluate the quality of a probability transformation [14], and a compromise must always be found between the entropy level and the numerical robustness of the transformation. Although the entropy should be as small as possible for decision-making, an exaggeratedly small entropy is not always preferred. Because of the way the mass of (partial) ignorances is proportionally redistributed with $\epsilon = 0$, if the mass assignment of a singleton equals zero in the original bba, then after applying the DSmP or HDSmP transformations this mass is unchanged and is kept at zero. This behavior may appear intuitively surprising at first glance, especially if some masses of partial ignorances including this singleton are not equal to zero. It is however normal in the spirit of proportional redistribution because one wants to reduce the PIC value, so that if one has no strong support (belief) in a singleton in the original bba, one also expects to have no strong support in this singleton after the transformation is applied, which makes perfect sense. Of course, if such behavior is considered too optimistic or not acceptable because it appears too risky in some applications, it is always possible to choose another transformation instead. The final choice is always left in the hands of the user, or of the fusion system designer.


V. CONCLUSIONS 


Probability transformation is crucial for decision-making in evidence theory. In this paper a novel hierarchical probability transformation approach called HDSmP has been proposed. In all the examples studied, HDSmP provides a moderate value of entropy, which is necessary for an easier and more reliable decision-making support. Unfortunately, the PIC (or entropy) level is not the only useful criterion to evaluate the quality of a probability transformation in general. At the very least, the numerical robustness of the method is also important and must be considered seriously, as already shown in our previous works. Therefore, to evaluate any probability transformation more efficiently and to outperform existing transformations (including DSmP and HDSmP), a more general and comprehensive evaluation criterion needs to be found. The search for such a criterion is under investigation.


Neutrosophic Masses & Indeterminate Models


Applications to Information Fusion 


Florentin Smarandache 


Abstract—In this paper we introduce the indeterminate models in 
information fusion, which are due either to the existence of some 
indeterminate elements in the fusion space or to some 
indeterminate masses. The best approach for dealing with such 
models is the neutrosophic logic. 


Keywords: neutrosophic logic; indeterminacy; indeterminate 
model; indeterminate element; indeterminate mass; indeterminate 
fusion rules; DSmT; DST; TBM; 


I. INTRODUCTION


In this paper we introduce for the first time the notions of indeterminate mass (bba), indeterminate element, indeterminate intersection, and so on. We give an example of neutrosophic dynamic fusion using two classical masses, defined on a determinate frame of discernment, but having indeterminate intersections in the super-power set $S^\Theta$ (the fusion space). We also adjust several classical fusion rules (PCR5 and DSmH) to work for indeterminate intersections instead of empty intersections.


References [3]-[13] show a wide variety of applications of 
the neutrosophic logic and set, based on indeterminacy, in 
information technology. 


Let $\Theta$ be a frame of discernment, defined as:

$$
\Theta = \{\phi_1, \phi_2, \ldots, \phi_n\}, \quad n \geq 2, \qquad (1)
$$

and its Super-Power Set (or fusion space):

$$
S^\Theta = (\Theta, \cup, \cap, C) \qquad (2)
$$

which means the set $\Theta$ closed under union, intersection, and respectively complement.


This paper is organized as follows: we present the neutrosophic logic, the indeterminate masses, elements and models, and give an example of indeterminate intersection.

Originally published as Smarandache F., Neutrosophic Masses & Indeterminate Models, in Proc. of Fusion 2012, Singapore, July 2012, and reprinted with permission.


II. INDETERMINATE MASS


A. Neutrosophic Logic 

Neutrosophic Logic (NL) [1] started in 1995 as a generalization of the fuzzy logic, especially of the intuitionistic fuzzy logic. A logical proposition P is characterized by three neutrosophic components:

$$
NL(P) = (T, I, F) \qquad (3)
$$

where T is the degree of truth, F the degree of falsehood, and I the degree of indeterminacy (or neutral, where the name “neutro-sophic” comes from, i.e. neither truth nor falsehood but in between, or the included-middle principle), and with:

$$
T, I, F \subseteq \left]^{-}0, 1^{+}\right[ \qquad (4)
$$

where $\left]^{-}0, 1^{+}\right[$ is a non-standard interval.

In this paper, for technical purposes, we can reduce this interval to the standard interval [0, 1].


The main distinction between neutrosophic logic and intuitionistic fuzzy logic (IFL) is that in NL the sum T+I+F of the components, when T, I, and F are crisp numbers, does not necessarily need to be 1 as in IFL, but it can also be less than 1 (for incomplete/missing information), equal to 1 (for complete information), or greater than 1 (for paraconsistent/contradictory information).


The combination of neutrosophic propositions is done using the neutrosophic operators (especially $\wedge$, $\vee$).


B. Neutrosophic Mass 
We recall that a classical mass m(.) is defined as:

$$
m : S^\Theta \rightarrow [0, 1] \qquad (5)
$$

such that

$$
\sum_{X \in S^\Theta} m(X) = 1 \qquad (6)
$$

We extend this classical basic belief assignment (mass) m(.) to a neutrosophic basic belief assignment (nbba) (or neutrosophic mass) $m_n(.)$ in the following way:

$$
m_n : S^\Theta \rightarrow [0, 1]^3 \qquad (7)
$$

with

$$
m_n(A) = (T(A), I(A), F(A)) \qquad (8)
$$

where T(A) means the (local) chance that hypothesis A occurs, F(A) means the (local) chance that hypothesis A does not occur (nonchance), while I(A) means the (local) indeterminate chance of A (i.e. knowing neither if A occurs nor if A doesn’t occur), such that:

$$
\sum_{X \in S^\Theta} [T(X) + I(X) + F(X)] = 1 \qquad (9)
$$

In a more general way, the summation (9) can be less than 1 (for incomplete neutrosophic information), equal to 1 (for complete neutrosophic information), or greater than 1 (for paraconsistent/conflicting neutrosophic information). But in this paper we only present the case when summation (9) is equal to 1.

Of course,

$$
0 \leq T(A), I(A), F(A) \leq 1. \qquad (10)
$$
A basic belief assignment (or mass) is considered indeterminate if there exists at least one element $A \in S^\Theta$ such that I(A) > 0, i.e. there exists some indeterminacy in the chance of at least one element A occurring or not occurring. Therefore, a neutrosophic mass which has at least one element A with I(A) > 0 is an indeterminate mass.


A classical mass m(.) as defined in equations (5) and (6) can be extended under the form of a neutrosophic mass $m_n'(.)$ in the following way:

$$
m_n' : S^\Theta \rightarrow [0, 1]^3 \qquad (11)
$$

with

$$
m_n'(A) = (m(A), 0, 0) \qquad (12)
$$

but reciprocally it does not work, since I(A) has no correspondence in the definition of the classical mass.
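The definitions (7)-(12) can be mirrored by a small data structure. The Python sketch below is only an illustration: the class name `NMass`, the helper names and the numerical values of the sample nbba are hypothetical and do not come from the paper. It checks the completeness condition (9), tests whether a mass is indeterminate, and embeds a classical mass as in (12).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NMass:
    """Neutrosophic mass of one element: (T, I, F) as in Eq. (8)."""
    T: float
    I: float
    F: float


def is_complete(nbba, tol=1e-9):
    """Condition (9): all components over all elements sum to 1."""
    total = sum(v.T + v.I + v.F for v in nbba.values())
    return abs(total - 1.0) <= tol


def is_indeterminate(nbba):
    """A mass is indeterminate if I(A) > 0 for at least one element A."""
    return any(v.I > 0 for v in nbba.values())


def from_classical(m):
    """Eq. (12): embed a classical mass as (m(A), 0, 0)."""
    return {a: NMass(v, 0.0, 0.0) for a, v in m.items()}


# Hypothetical nbba on two elements, chosen so that (9) holds.
nbba = {"A": NMass(0.4, 0.1, 0.2), "B": NMass(0.2, 0.1, 0.0)}
classical = from_classical({"A": 0.6, "B": 0.4})

print(is_complete(nbba), is_indeterminate(nbba))
print(is_complete(classical), is_indeterminate(classical))
```

The embedded classical mass is complete and never indeterminate, which is the one-way correspondence described above.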


We just have T(A) = m(A) and F(A) = m(C(A)), where C(A) is the complement of A. The non-null I(A) can, for example, be roughly approximated by the total ignorance mass $m(\Theta)$, or better by the partial ignorance mass $m(\Theta_I)$, where $\Theta_I$ is the union of all singletons that have some non-zero indeterminacy, but these approximations mean less accuracy and less refinement in the fusion.


If I(X) = 0 for all $X \in S^\Theta$, then the neutrosophic mass is simply reduced to a classical mass.


III. INDETERMINATE ELEMENT


We have two types of elements in the fusion space $S^\Theta$: determinate elements (which are well-defined), and indeterminate elements (which are not well-defined; for example, a geographical area whose frontiers are vague; or, let’s say, in a murder case there are two suspects: John, who is a known/determinate element, acted together with another man X (since the information source saw John together with an unknown/unidentified person), and therefore X is an indeterminate element).


Herein we gave examples of singletons as indeterminate elements just in the frame of discernment $\Theta$, but indeterminate elements can also result from the combinations (unions, intersections, and/or complements) of determinate elements that form the super-power set $S^\Theta$. For example, A and B can be determinate singletons (we call the elements in $\Theta$ singletons), but their intersection $A \cap B$ can be an indeterminate (unknown) element, in the sense that we might not know if $A \cap B = \emptyset$ or $A \cap B \neq \emptyset$.


Or A can be a determinate element, but its complement C(A) can be an indeterminate element (not well-known), and similarly for determinate elements A and B, whose union $A \cup B$ might be indeterminate.


Indeterminate elements in $S^\Theta$ can, of course, result from the combination of indeterminate singletons too. All depends on the problem that is studied.


A frame of discernment which has at least one indeterminate element is called an indeterminate frame of discernment. Otherwise, it is called a determinate frame of discernment.


Similarly, we call an indeterminate fusion space ($S^\Theta$) a fusion space which has at least one indeterminate element. Of course, an indeterminate frame of discernment spans an indeterminate fusion space.


An indeterminate source of information is a source which provides an indeterminate mass or an indeterminate fusion space. Otherwise it is called a determinate source of information.


IV. INDETERMINATE MODEL 


An indeterminate model is a model whose fusion space is indeterminate, or for which a mass that characterizes it is indeterminate.


Such a case has not been studied in the information fusion literature so far. In the next sections we’ll present some examples of indeterminate models.


V. CLASSIFICATION OF MODELS


In the classical fusion theories all elements are considered determinate in the Closed World, except in Smets’ Open World where there is some room (i.e. mass assigned to the empty set) for a possible unknown missing singleton in the frame of discernment. So, the Open World has a probable indeterminate element, and thus its frame of discernment is indeterminate, while the Closed World frame of discernment is determinate.


In the Closed World in Dezert-Smarandache Theory there are three models classified upon the types of singleton intersections: Shafer’s Model (where all intersections are empty), Hybrid Model (where some intersections are empty, while others are non-empty), and Free Model (where all intersections are non-empty).


We now introduce a fourth category, called Indeterminate Model (where at least one intersection is indeterminate/unknown, and in general at least one element of the fusion space is indeterminate). We do this because in practical problems we don’t always know if an intersection is empty or nonempty. As we still have to solve the problem in real time, we have to work with what we have, i.e. with indeterminate models.


The indeterminate intersection cannot be refined (because, not knowing if $A \cap B$ is empty or nonempty, we’d get two different refinements: $\{A, B\}$ when the intersection is empty, and $\{A \setminus B, B \setminus A, A \cap B\}$ when the intersection is nonempty).


The percentage of indeterminacy of a model depends on the 
number of indeterminate elements and indeterminate masses. 


By default: the sources, the masses, the elements, the 
frames of discernment, the fusion spaces, and the models are 
supposed determinate. 


VI. AN EXAMPLE OF INFORMATION FUSION WITH AN INDETERMINATE MODEL


We present the example below.


Suppose we have two sources, $m_1(.)$ and $m_2(.)$, such that:












































            A       B       C       $A \cup B \cup C$   $A \cap B$   $A \cap C$    $B \cap C$
                                                        = Ind.       = $\emptyset$  = Ind.
$m_1$       0.4     0.2     0.3     0.1
$m_2$       0.1     0.3     0.2     0.4
$m_{12}$    0.21    0.17    0.20    0.04                0.14         0.11          0.13
Table 1


Applying the conjunctive rule to $m_1$ and $m_2$ we get $m_{12}(.)$ as shown in Table 1.


The frame of discernment is $\Theta = \{A, B, C\}$. We know that $A \cap C$ is empty, but we don’t know the other two intersections: we note them as $A \cap B = ind.$ and $B \cap C = ind.$, where ind. means indeterminate.


Using the Conjunctive Rule to fuse $m_1$ and $m_2$, we get $m_{12}(.)$:

$$
\forall A \in S^\Theta \setminus \emptyset, \quad m_{12}(A) = \sum_{\substack{X, Y \in S^\Theta \\ A = X \cap Y}} m_1(X) \, m_2(Y). \qquad (13)
$$

Whence: $m_{12}(A) = 0.21$, $m_{12}(B) = 0.17$, $m_{12}(C) = 0.20$, $m_{12}(A \cup B \cup C) = 0.04$, and for the intersections: $m_{12}(A \cap B) = 0.14$, $m_{12}(A \cap C) = 0.11$, $m_{12}(B \cap C) = 0.13$.
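The values of Table 1 follow directly from formula (13). In the Python sketch below, the intersection of two distinct singletons is kept as a symbolic key such as `"A&B"`, since at this stage its status (empty or indeterminate) does not matter; the key format, the label `THETA` and the function name `conjunctive` are illustrative assumptions.

```python
from collections import defaultdict

THETA = "Theta"  # stands for the total ignorance A u B u C


def conjunctive(m1, m2):
    """Conjunctive rule (13) for masses on singletons and the total ignorance."""
    m12 = defaultdict(float)
    for x, mx in m1.items():
        for y, my in m2.items():
            if x == y or y == THETA:
                key = x                          # X n X = X, X n Theta = X
            elif x == THETA:
                key = y                          # Theta n Y = Y
            else:
                key = "&".join(sorted((x, y)))   # symbolic intersection
            m12[key] += mx * my
    return dict(m12)


m1 = {"A": 0.4, "B": 0.2, "C": 0.3, THETA: 0.1}
m2 = {"A": 0.1, "B": 0.3, "C": 0.2, THETA: 0.4}
m12 = conjunctive(m1, m2)

for key in sorted(m12):
    print(key, round(m12[key], 2))
```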
We then use the PCR5 fusion rule style to redistribute the masses of these three intersections. We recall PCR5 for two sources:

$$
\forall A \in S^\Theta \setminus \emptyset, \quad
m_{12PCR5}(A) = m_{12}(A) + \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A = \emptyset}} \left[ \frac{m_1(A)^2 \, m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2 \, m_1(X)}{m_2(A) + m_1(X)} \right] \qquad (14)
$$

a) $m_{12}(A \cap C) = 0.11$ is redistributed back to A and C because $A \cap C = \emptyset$, according to the PCR5 style.


Let $\alpha_1$ and $\alpha_2$ be the parts of the mass 0.11 redistributed back to A, and $\gamma_1$ and $\gamma_2$ be the parts of the mass 0.11 redistributed back to C.


We have the following proportionalizations:

$$
\frac{\alpha_1}{0.4} = \frac{\gamma_1}{0.2} = \frac{0.4 \cdot 0.2}{0.4 + 0.2} = 0.133333,
$$

whence $\alpha_1 = 0.4(0.133333) = 0.053333$ and $\gamma_1 = 0.2(0.133333) = 0.026667$.

Similarly:

$$
\frac{\alpha_2}{0.1} = \frac{\gamma_2}{0.3} = \frac{0.1 \cdot 0.3}{0.1 + 0.3} = 0.075,
$$

whence $\alpha_2 = 0.1(0.075) = 0.0075$ and $\gamma_2 = 0.3(0.075) = 0.0225$.

Therefore the mass of A, which can also be noted as T(A) in a neutrosophic mass form, receives back from 0.11:

$\alpha_1 + \alpha_2 = 0.053333 + 0.0075 = 0.060833$,

while the mass of C, or T(C) in a neutrosophic form, receives back from 0.11:

$\gamma_1 + \gamma_2 = 0.026667 + 0.0225 = 0.049167$.

We verify our calculations: 0.060833 + 0.049167 = 0.11.

b) $m_{12}(A \cap B) = 0.14$ is redistributed back to the indeterminate parts of the masses of A and B respectively, namely I(A) and I(B) as noted in the neutrosophic mass form, because $A \cap B = Ind.$ We follow the same PCR5 style as done in classical PCR5 for empty intersections (as above).

Let $\alpha_3$ and $\alpha_4$ be the parts of the mass 0.14 redistributed back to I(A), and $\beta_1$ and $\beta_2$ be the parts of the mass 0.14 redistributed back to I(B).

We have the following proportionalizations:

$$
\frac{\alpha_3}{0.4} = \frac{\beta_1}{0.3} = \frac{0.4 \cdot 0.3}{0.4 + 0.3} = 0.171429,
$$

whence $\alpha_3 = 0.4(0.171429) = 0.068572$ and $\beta_1 = 0.3(0.171429) = 0.051428$.

Similarly:

$$
\frac{\alpha_4}{0.1} = \frac{\beta_2}{0.2} = \frac{0.1 \cdot 0.2}{0.1 + 0.2} = 0.066667,
$$

whence $\alpha_4 = 0.1(0.066667) = 0.006667$ and $\beta_2 = 0.2(0.066667) = 0.013333$.

Therefore, the indeterminate mass of A, I(A), receives back from 0.14:

$\alpha_3 + \alpha_4 = 0.068572 + 0.006667 = 0.075239$

and the indeterminate mass of B, I(B), receives back from 0.14:

$\beta_1 + \beta_2 = 0.051428 + 0.013333 = 0.064761$.


c) Analogously, $m_{12}(B \cap C) = 0.13$ is redistributed back to the indeterminate parts of the masses of B and C respectively, namely I(B) and I(C) as noted in the neutrosophic mass form, because $B \cap C = Ind.$, also following the PCR5 style. Whence I(B) gets back 0.065 (= 0.02 + 0.045) and I(C) also gets back 0.065.

Finally we sum all the results obtained from firstly using the Conjunctive Rule [Table 1] and secondly redistributing the intersection masses with PCR5 [sections a), b), and c) above]:












































               T(A)       T(B)    T(C)       T($\Theta$)   I(A)       I(B)       I(C)
$m_{12}$       0.21       0.17    0.20       0.04
additions      0.053333           0.026667                 0.068572   0.051428   0.02
               0.0075             0.0225                   0.006667   0.013333   0.045
                                                                      0.02
                                                                      0.045
$m_{12PCR5I}$  0.270833   0.17    0.249167   0.04          0.075239   0.129761   0.065
Table 2


where $\Theta = A \cup B \cup C$ is the total ignorance.
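The last row of Table 2 can be recomputed in a few lines. The Python sketch below applies the conjunctive rule and then the PCR5-style proportional split to each product of two distinct singletons, sending the shares to T(.) when the intersection is known to be empty and to I(.) when it is indeterminate. The function name `fuse`, the `status` dictionary and the labels `"empty"`/`"ind"` are illustrative assumptions; the sketch only covers masses on singletons and the total ignorance, as in this example.

```python
THETA = "Theta"  # total ignorance A u B u C
SINGLETONS = ("A", "B", "C")


def fuse(m1, m2, status):
    """Conjunctive rule followed by a PCR5-style redistribution.

    `status` maps a sorted pair such as ("A", "C") to "empty" or "ind".
    """
    T = {x: 0.0 for x in SINGLETONS + (THETA,)}
    I = {x: 0.0 for x in SINGLETONS}
    # Products that fall on X itself: X with X, or X with the total ignorance.
    for x in T:
        T[x] += m1[x] * m2[x]
        if x != THETA:
            T[x] += m1[x] * m2[THETA] + m1[THETA] * m2[x]
    # Products that fall on the intersection of two distinct singletons.
    for x in SINGLETONS:
        for y in SINGLETONS:
            if x == y:
                continue
            a, b = m1[x], m2[y]
            if a + b == 0:
                continue
            target = T if status[tuple(sorted((x, y)))] == "empty" else I
            target[x] += a * a * b / (a + b)   # share returned to x
            target[y] += b * b * a / (a + b)   # share returned to y
    return T, I


m1 = {"A": 0.4, "B": 0.2, "C": 0.3, THETA: 0.1}
m2 = {"A": 0.1, "B": 0.3, "C": 0.2, THETA: 0.4}
status = {("A", "B"): "ind", ("A", "C"): "empty", ("B", "C"): "ind"}

T, I = fuse(m1, m2, status)
print({k: round(v, 6) for k, v in T.items()})
print({k: round(v, 6) for k, v in I.items()})
```

The last printed digit may differ slightly from Table 2, where intermediate quotients are rounded to six decimals before being multiplied.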


VII. BELIEF, DISBELIEF, AND UNCERTAINTY 


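In classical fusion theory there exist the following functions:

Belief in A with respect to the bba m(.) is:

$$
Bel(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \subseteq A}} m(X) \qquad (15)
$$

Disbelief in A with respect to the bba m(.) is:

$$
Dis(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A = \emptyset}} m(X) \qquad (16)
$$

Uncertainty in A with respect to the bba m(.) is:

$$
U(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A \neq \emptyset \\ X \cap C(A) \neq \emptyset}} m(X), \qquad (17)
$$

where C(A) is the complement of A with respect to the total ignorance $\Theta$.

Plausibility of A with respect to the bba m(.) is:

$$
Pl(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A \neq \emptyset}} m(X) \qquad (18)
$$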
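Formulas (15)-(18) translate almost literally into set operations. The Python sketch below assumes Shafer's model, so that the elements of the fusion space can be represented as subsets of the frame (Python `frozenset`s); the bba used here is a hypothetical one chosen for illustration, and the function names are not notation from the paper.

```python
def bel(m, a):
    """Eq. (15): masses of the non-empty X included in A."""
    return sum(v for x, v in m.items() if x and x <= a)


def dis(m, a):
    """Eq. (16): masses of the non-empty X disjoint from A."""
    return sum(v for x, v in m.items() if x and not (x & a))


def unc(m, a, frame):
    """Eq. (17): masses of the X meeting both A and its complement."""
    comp = frame - a
    return sum(v for x, v in m.items() if (x & a) and (x & comp))


def pl(m, a):
    """Eq. (18): masses of the X meeting A."""
    return sum(v for x, v in m.items() if x & a)


frame = frozenset("ABC")
# Hypothetical bba on subsets of the frame {A, B, C}.
m = {
    frozenset("A"): 0.2,
    frozenset("B"): 0.3,
    frozenset("AC"): 0.1,
    frozenset("ABC"): 0.4,
}
a = frozenset("A")

print(bel(m, a), dis(m, a), unc(m, a, frame), pl(m, a))
```

For this representation Bel(A) + Dis(A) + U(A) = 1 and Pl(A) = Bel(A) + U(A), which gives a quick sanity check of the four definitions.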
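VIII. NEUTROSOPHIC BELIEF, NEUTROSOPHIC DISBELIEF, AND NEUTROSOPHIC UNDECIDABILITY


Let’s consider a neutrosophic mass $m_n(.)$ as defined in formulas (7) and (8), $m_n(X) = (T(X), I(X), F(X))$ for all $X \in S^\Theta$.

We extend formulas (15)-(18) from m(.) to $m_n(.)$:

Neutrosophic Belief in A with respect to the nbba $m_n(.)$ is:

$$
NeutBel(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \subseteq A}} T(X) + \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A = \emptyset}} F(X) \qquad (19)
$$

Neutrosophic Disbelief in A with respect to the nbba $m_n(.)$ is:

$$
NeutDis(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A = \emptyset}} T(X) + \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \subseteq A}} F(X) \qquad (20)
$$

Neutrosophic Uncertainty in A with respect to the nbba $m_n(.)$ is:

$$
NeutU(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A \neq \emptyset \\ X \cap C(A) \neq \emptyset}} T(X) + \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A \neq \emptyset \\ X \cap C(A) \neq \emptyset}} F(X)
= \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A \neq \emptyset \\ X \cap C(A) \neq \emptyset}} [T(X) + F(X)] \qquad (21)
$$

We now introduce the Neutrosophic Global Indeterminacy in A with respect to the nbba $m_n(.)$ as a sum of the local indeterminacies of the elements included in A:

$$
NeutGlobInd(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \subseteq A}} I(X) \qquad (22)
$$

And afterwards we define another function called Neutrosophic Undecidability about A with respect to the nbba $m_n(.)$:

$$
NeutUnd(A) = NeutU(A) + NeutGlobInd(A) \qquad (23)
$$

or

$$
NeutUnd(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A \neq \emptyset \\ X \cap C(A) \neq \emptyset}} [T(X) + F(X)] + \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \subseteq A}} I(X) \qquad (24)
$$

Neutrosophic Plausibility of A with respect to the nbba $m_n(.)$ is:

$$
NeutPl(A) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A \neq \emptyset}} T(X) + \sum_{\substack{Y \in S^\Theta \setminus \{\emptyset\} \\ C(Y) \cap A \neq \emptyset}} F(Y) \qquad (25)
$$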
In the previous example let’s compute NeutBel(.), NeutDis(.), and NeutGlobInd(.):

               A          B          C          $A \cup B \cup C$
NeutBel        0.270833   0.17       0.249167   0.73
NeutDis        0.419167   0.52       0.440833   0
NeutGlobInd    0.115239   0.169761   0.105      0
Total          0.805239   0.859761   0.795      0.73
               $\neq 1$   $\neq 1$   $\neq 1$   $\neq 1$
Table 3


As we see, for an indeterminate model we cannot use the intuitionistic fuzzy set or intuitionistic fuzzy logic since the sum NeutBel(X)+NeutDis(X)+NeutGlobInd(X) is less than 1. In this case we use the neutrosophic set or logic, which can deal with incomplete information.

The sum is less than 1 because there is missing information (we don’t know if some intersections are empty or not).

For example:
NeutBel(A)+NeutDis(A)+NeutGlobInd(A) = 0.805239 = 1 - I(B) - I(C).
Similarly,
NeutBel(B)+NeutDis(B)+NeutGlobInd(B) = 0.859761 = 1 - I(A) - I(C),
NeutBel(C)+NeutDis(C)+NeutGlobInd(C) = 0.795 = 1 - I(A) - I(B),
and
NeutBel($A \cup B \cup C$)+NeutDis($A \cup B \cup C$)+NeutGlobInd($A \cup B \cup C$) = 0.73 = 1 - I(A) - I(B) - I(C).








IX. NEUTROSOPHIC DYNAMIC FUSION 


A Neutrosophic Dynamic Fusion is a dynamic fusion where some indeterminacy occurs, with respect to the mass or with respect to some elements.

The solution of the above indeterminate model, which has missing information, using the neutrosophic set is consistent with the classical dynamic fusion in the case where we later receive part (or all) of the missing information.

In the above example, let’s say we find out later in the fusion process that $A \cap B = \emptyset$. That means that the mass of indeterminacy of A, I(A) = 0.075239, is transferred to A, and the masses of indeterminacy of B (resulting from $A \cap B$ only), i.e. 0.051428 and 0.013333, are transferred to B. We get:

          A          B          C          $\Theta$   I(A)   I(B)    I(C)    $A \cap B$   $A \cap C$
m         0.270833   0.17       0.249167   0.04       0      0.065   0.065   0            0
+         0.075239   0.051428
                     0.013333
$m_N$     0.346072   0.234761   0.249167   0.04       0      0.065   0.065   0            0
Table 4


where $\Theta = A \cup B \cup C$ is the total ignorance.

The sum NeutBel(X)+NeutDis(X)+NeutGlobInd(X) increases towards 1 as the indeterminacy I(X) decreases towards 0, and reciprocally.

When we have complete information we get NeutBel(X)+NeutDis(X)+NeutGlobInd(X) = 1, and in this case we have an intuitionistic fuzzy set, which is a particular case of the neutrosophic set.

Let’s suppose once more, considering the neutrosophic dynamic fusion, that afterwards we find out that $B \cap C \neq \emptyset$. Then, from Table 4, the masses of indeterminacy of B, I(B) (0.065 = 0.02 + 0.045, resulting from $B \cap C$, which was considered indeterminate at the beginning of the neutrosophic dynamic fusion), and that of C, I(C) = 0.065, now go to $B \cap C$. Thus, we get:

          A          B          C          $\Theta$   I(A)   I(B)     I(C)     $A \cap B$   $A \cap C$   $B \cap C$
$m_N$     0.346072   0.234761   0.249167   0.04       0      0.065    0.065    0            0            0
-/+                                                          -0.065   -0.065                             +0.065
                                                                                                         +0.065
$m_{NN}$  0.346072   0.234761   0.249167   0.04       0      0        0        0            0            0.13
Table 5
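The two updates of the neutrosophic dynamic fusion amount to moving the indeterminate shares once the status of an intersection becomes known. The Python sketch below keeps, for each indeterminate intersection, the share it contributed to each I(.), and moves these shares either back to the elements (intersection found empty) or onto the intersection itself (intersection found non-empty). The function name `resolve`, the key format `"B&C"` and the bookkeeping by intersection are illustrative assumptions; the numbers are those of Tables 2, 4 and 5.

```python
def resolve(T, ind, pair, empty):
    """Move the indeterminate shares recorded for `pair` once its status is known."""
    shares = ind.pop(pair)
    if empty:
        # Intersection is empty: each share goes back to its own element.
        for element, share in shares.items():
            T[element] = T.get(element, 0.0) + share
    else:
        # Intersection is non-empty: the whole mass goes to the intersection.
        key = "&".join(pair)
        T[key] = T.get(key, 0.0) + sum(shares.values())


# Determinate masses and indeterminate shares after the PCR5-style step.
T = {"A": 0.270833, "B": 0.17, "C": 0.249167, "Theta": 0.04}
ind = {
    ("A", "B"): {"A": 0.075239, "B": 0.064761},
    ("B", "C"): {"B": 0.065, "C": 0.065},
}

resolve(T, ind, ("A", "B"), empty=True)    # first update (Table 4)
resolve(T, ind, ("B", "C"), empty=False)   # second update (Table 5)
print({k: round(v, 6) for k, v in T.items()})
```

After both updates no indeterminate share is left and the masses sum to 1, which is the complete-information case discussed above.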


X. MORE REDISTRIBUTION VERSIONS FOR INDETERMINATE 
INTERSECTIONS OF DETERMINATE ELEMENTS 


Besides PCR5, it is also possible to employ other fusion rules for the redistribution, such as the following:

a. For the masses of the empty intersections we can use PCR1-PCR4, URR, PURR, Dempster’s Rule, etc. (in general any fusion rule that first uses the conjunctive rule, and then a redistribution of the masses of empty intersections).

b. For the masses of the indeterminate intersections we can use the DSm Hybrid (DSmH) rule to transfer the mass $m(X \cap Y = ind.)$ to $X \cup Y$, since $X \cup Y$ is a kind of uncertainty related to X, Y.

In our opinion, a better approach in this case would be to redistribute the empty intersection masses using PCR5 and the indeterminate intersection masses using DSmH, so we can combine two fusion rules into one:

Let $m_1(.)$ and $m_2(.)$ be two masses. Then:

$$
\begin{aligned}
m_{12PCR5/DSmH}(A) ={}& \sum_{\substack{X, Y \in S^\Theta \setminus \{\emptyset\} \\ X \cap Y = A}} m_1(X) \, m_2(Y)
+ \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A = \emptyset}} \left[ \frac{m_1(A)^2 \, m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2 \, m_1(X)}{m_2(A) + m_1(X)} \right] \\
&+ \sum_{\substack{X, Y \in S^\Theta \setminus \{\emptyset\} \\ X \cap Y = ind. \\ X \cup Y = A}} m_1(X) \, m_2(Y) \\
={}& \sum_{\substack{X, Y \in S^\Theta \setminus \{\emptyset\} \\ \{X \cap Y = A\} \vee \{(X \cap Y = ind.) \wedge (X \cup Y = A)\}}} m_1(X) \, m_2(Y)
+ \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A = \emptyset}} \left[ \frac{m_1(A)^2 \, m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2 \, m_1(X)}{m_2(A) + m_1(X)} \right]
\end{aligned}
\qquad (26)
$$


Yet, the best approach, for an indeterminate intersection resulting from the combination of two classical masses $m_1(.)$ and $m_2(.)$ defined on a determinate frame of discernment, is the first one:

a. Use PCR5 to combine the two sources: formula (14).

b. Use PCR5-ind [adjusted from the classical PCR5 formula (14)] in order to compute the indeterminacies of each element involved in indeterminate intersections:

$$
\forall A \in S^\Theta \setminus \emptyset, \quad
m_{12PCR5Ind}(I(A)) = \sum_{\substack{X \in S^\Theta \setminus \{\emptyset\} \\ X \cap A = ind.}} \left[ \frac{m_1(A)^2 \, m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2 \, m_1(X)}{m_2(A) + m_1(X)} \right] \qquad (27)
$$

c. Compute NeutBel(.), NeutDis(.), NeutGlobInd(.) of each element.


CONCLUSION 


In this paper we introduced for the first time the
notions of indeterminate mass (bba), indeterminate element,
indeterminate intersection, and so on. We gave an example of
neutrosophic dynamic fusion using two classical masses,
defined on a determinate frame of discernment, but having
indeterminate intersections in the super-power set $S^\Theta$ (the
fusion space). We adjusted several classical fusion rules (PCR5
and DSmH) to work for indeterminate intersections instead of
empty intersections.

Then we extended the classical Bel(.), Dis(.) {also
called Dou(.), i.e. Doubt} and the uncertainty U(.) functions to
their respective neutrosophic counterparts that use
the neutrosophic masses, i.e. to NeutBel(.), NeutDis(.),
NeutU(.) and to the undecidability function NeutUnd(.). We
have also introduced the Neutrosophic Global Indeterminacy
function, NeutGlobInd(.), which together with NeutU(.) forms
the NeutUnd(.) function.

In our first example the mass of $A \cap B$ is determined (it
is equal to 0.14), but the element $A \cap B$ is indeterminate (we
don’t know whether it is empty or not).

But there are cases when the element is determinate (let’s say a
suspect John), but its mass could be indeterminate as given by a
source of information {for example $m_1$(John) = (0.4, 0.1, 0.2),
i.e. there is some mass indeterminacy: I(John) = 0.2 > 0}.

These are the distinctions between the indeterminacy of an
element and the indeterminacy of a mass.
Extended PCR Rules for Dynamic Frames


Florentin Smarandache 
Jean Dezert 


Abstract—In most of classical fusion problems modeled from 
belief functions, the frame of discernment is considered as static. 
This means that the set of elements in the frame and the 
underlying integrity constraints of the frame are fixed forever 
and they do not change with time. In some applications, like in 
target tracking for example, the use of such invariant frame is 
not very appropriate because it can truly change with time. So it 
is necessary to adapt the Proportional Conflict Redistribution 
fusion rules (PCR5 and PCR6) for working with dynamical 
frames. In this paper, we propose an extension of PCR5 and 
PCR6 rules for working in a frame having some non-existential 
integrity constraints. Such constraints on the frame can arise in 
tracking applications by the destruction of targets for example. 
We show through very simple examples how these new rules can 
be used for the belief revision process. 


Keywords: Information fusion, DSmT, integrity con- 
straints, belief functions. 


I. INTRODUCTION 


In most classical fusion problems using belief functions,
the frame of discernment $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ is considered
static. This means that the set of elements in the frame
(assumed to be non-empty and distinct) and the underlying
integrity constraints of the frame¹ are fixed and do not
change with time. In some applications however, such as target
tracking and battlefield surveillance, the use of
such an invariant frame is not very appropriate because the frame can
truly change with time depending on the evolution of the
events. So it is necessary to adapt the Proportional Conflict
Redistribution fusion rules (PCR5 and PCR6) for working
with dynamical frames. In this paper, we study in detail how
to work with the PCR5 or PCR6 fusion rules in a dynamical
frame subject to non-existential integrity constraints, when one
or several elements of the frame disappear. This phenomenon
can occur in some applications, especially in defense and
battlefield surveillance, when foe targets (considered as elements
of the frame) can be shot and entirely destroyed and the
initial belief one has on threat assessment must be revised
according to the knowledge one has of this new fact, obtained
from intelligence services or observation systems. We show
through very simple examples how this problem can be solved
using the PCR principle.

Example 1: Let’s consider the set of three targets at a given
time k to be $\Theta_k = \{\theta_1, \theta_2, \theta_3\}$ with $\theta_i \neq \emptyset$, i = 1, 2, 3, and
assume that $\Theta_k$ satisfies Shafer’s model (i.e. the targets are all
distinct and exhaustive) and that we work with normalized bba’s.
Suppose one has two basic belief assignments (bba’s) $m_1(.)$ and
$m_2(.)$ defined with respect to the power-set of $\Theta_k$, given by
two distinct sources of evidence to characterize their beliefs
in the most threatening target. Let’s assume that one receives
at k + 1 new information confirming that one target, say
target $\theta_3$, has been destroyed. The problem one needs to solve
is how to combine efficiently $m_1(.)$ and $m_2(.)$ taking into
account this new non-existential integrity constraint $\theta_3 = \emptyset$ in
the new model of the frame, in order to establish the most threatening
surviving targets belonging to $\Theta_{k+1} = \{\theta_1, \theta_2\}$.

¹ This is also called the model for $\Theta$, which can correspond to the DSm free,
DSm hybrid or Shafer’s models in the DSmT framework [4].

Originally published as Smarandache F., Dezert J., Extended
PCR Rules for Dynamic Frames, Proc. of Fusion 2012,
Singapore, July 2012, and reprinted with permission.

The contribution of this paper is to propose a solution 
to such kind of belief revision problem involving dynamical 
frames including non-existential constraints on some of its 
elements. This paper is organized as follows. In section 1, we 
briefly recall the basis of DSmT (Dezert-Smarandache Theory) 
[4] and its main rule of combination (PCR5 and PCR6) for 
the fusion of bba’s in a static frame. In section 2, we present 
an improvement/adaptation of PCR rules to work on frames 
with non-existential constraints (dynamical frames). In section 
3, we apply our method on some examples. Conclusions are 
then given in section 4. 


II. BASICS OF DSMT 


The purpose of the development of Dezert-Smarandache
Theory (DSmT) [4] is to overcome the limitations of
Dempster-Shafer Theory (DST) [3], mainly by proposing new
underlying models for the frames of discernment in order to
fit better with the nature of real problems, and by proposing
new efficient combination and conditioning rules. In the DSmT
framework, the elements $\theta_i$, i = 1, 2, ..., n of a given
frame $\Theta$ are not necessarily exclusive, and there is no
restriction on the $\theta_i$ other than their exhaustivity. In DSmT,
the hyper-power set $D^\Theta$ is defined as the set of
all composite propositions built from elements of $\Theta$ with
the operators $\cup$ and $\cap$. For instance, if $\Theta = \{\theta_1, \theta_2\}$, then
$D^\Theta = \{\emptyset, \theta_1, \theta_2, \theta_1 \cap \theta_2, \theta_1 \cup \theta_2\}$. The hyper-power set $D^\Theta$
reduces to the classical power-set $2^\Theta$ as soon as we assume
exclusivity between the elements of the frame (this is Shafer’s
model). A (generalized) basic belief assignment (bba for short)
is defined as the mapping $m : D^\Theta \to [0,1]$. More precisely, from a general frame $\Theta$,
we define a map $m(.) : D^\Theta \to [0,1]$ associated to a given
body of evidence $\mathcal{B}$ as


$$
m(\emptyset) = 0 \quad \text{and} \quad \sum_{A \in D^\Theta} m(A) = 1 \tag{1}
$$


The quantity m(A) is called the generalized basic belief
assignment/mass (or just "bba" for short) of A.

The generalized credibility and plausibility functions are
defined in almost the same manner as within DST, i.e.

$$
Bel(A) = \sum_{\substack{B \subseteq A \\ B \in D^\Theta}} m(B) \quad \text{and} \quad Pl(A) = \sum_{\substack{B \cap A \neq \emptyset \\ B \in D^\Theta}} m(B) \tag{2}
$$
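As a small illustration of (2), the sketch below computes Bel(.) and Pl(.) in the special case of Shafer’s model, where focal elements can be stored as Python frozensets. The helper names `bel` and `pl`, the integer encoding of the $\theta_i$ and the sample bba are assumptions made for this illustration only; they are not taken from the toolboxes cited in the paper.

```python
def bel(m, A):
    """Credibility of A: total mass of the non-empty focal elements included in A."""
    return sum(v for B, v in m.items() if B and B <= A)


def pl(m, A):
    """Plausibility of A: total mass of the focal elements that intersect A."""
    return sum(v for B, v in m.items() if B & A)


# Hypothetical encoding: theta_i is the singleton frozenset {i}.
t1, t2, t3 = (frozenset({i}) for i in (1, 2, 3))

# Illustrative normalized bba on the power set of {theta_1, theta_2, theta_3}.
m = {t1: 0.2, t2: 0.4, t3: 0.3, t1 | t2: 0.1}

bel_12 = bel(m, t1 | t2)   # masses of theta_1, theta_2 and theta_1 U theta_2
pl_1 = pl(m, t1)           # masses of theta_1 and theta_1 U theta_2
print(round(bel_12, 4), round(pl_1, 4))
```

For any A one always has Bel(A) ≤ Pl(A), since every focal element included in A also intersects A.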

Two models² (the free model and the hybrid model) in DSmT can
be used to define the bba’s to combine. In the free DSm model,
the sources of evidence are combined without taking into
account integrity constraints. When the free DSm model does
not hold because of the true nature of the fusion problem under
consideration, we take into account some known integrity
constraints³ and define the bba’s to combine using the proper
hybrid DSm model. Aside from offering the possibility to work with
different underlying models (not only Shafer’s model as within
DST), DSmT also offers new efficient combination rules based
on proportional conflict redistribution (PCR rules no. 5 and no.
6) for combining highly conflicting sources of evidence. PCR5
transfers the conflicting mass only to the elements involved in
the conflict and proportionally to their individual masses, so
that the specificity of the information is entirely preserved in
this fusion process (see [4], Vol. 2 for full justification and
examples): $m_{PCR5}(\emptyset) = 0$ and $\forall X \in D^\Theta \setminus \{\emptyset\}$

$$
m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in D^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2)
+ \sum_{\substack{X_2 \in D^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right]
\tag{3}
$$


where all denominators in (3) are different from zero. If a
denominator is zero, that fraction is discarded. The
properties of PCR5 can be found in [2]. Extensions of PCR5
for combining qualitative bba’s can be found in [4], Vol.
2 & 3. All propositions/sets are in a canonical form. A
variant of PCR5, called PCR6, has been proposed by Martin
and Osswald in [4], Vol. 2, for combining s > 2 sources.
The general formulas for the PCR5 and PCR6 rules are also given
in [4], Vol. 2. PCR6 coincides with PCR5 when one
combines two sources. The difference between PCR5 and
PCR6 lies in the way the proportional conflict redistribution
is done as soon as three or more sources are involved in
the fusion. From the implementation point of view, PCR6 is
much simpler to implement than PCR5. For convenience,
very basic (not optimized) Matlab codes of the PCR5 and PCR6
fusion rules can be found in [4], [5] and from the toolboxes
repository on the web [7]. In the DSmT framework, the classical
pignistic transformation BetP(.) is replaced by the more
effective DSmP(.) transformation to estimate the subjective
probabilities of hypotheses for decision-making support once
the combination of bba’s has been done, if a compromise attitude
is chosen. The max of credibility (pessimistic decision attitude)
or the max of plausibility (optimistic decision attitude) are also
possible depending on the preference of the decision maker. This
topic is out of the scope of this paper and readers interested
in decision-making based on DSmP should refer to [4], Vol. 3,
freely available on the web.

² Actually, Shafer’s model, considering all elements of the frame as truly
exclusive, can be viewed as a special case of the hybrid model.
³ but non-existential integrity constraints as shown in Example 2.
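The two-source PCR5 rule (3) recalled above can be sketched in a few lines of Python for the special case of Shafer’s model, where focal elements are frozensets and the intersection is the ordinary set intersection. This is only a minimal sketch, not the Matlab code referred to in [4], [5] or [7]; the function name `pcr5`, the integer encoding of the $\theta_i$ and the sample bba’s are assumptions made for the illustration.

```python
from itertools import product


def pcr5(m1, m2):
    """Two-source PCR5 on a power set; bba's map frozensets to masses."""
    out = {}
    for (X, a), (Y, b) in product(m1.items(), m2.items()):
        if a == 0.0 or b == 0.0:
            continue  # null products (and zero denominators) are discarded
        Z = X & Y
        if Z:
            # Non-empty intersection: conjunctive consensus term of (3).
            out[Z] = out.get(Z, 0.0) + a * b
        else:
            # Partial conflict a*b is sent back to X and Y only,
            # proportionally to the masses a and b involved.
            out[X] = out.get(X, 0.0) + a * a * b / (a + b)
            out[Y] = out.get(Y, 0.0) + a * b * b / (a + b)
    return out


# Hypothetical encoding: theta_i is the singleton frozenset {i}.
t1, t2, t3 = (frozenset({i}) for i in (1, 2, 3))
m1 = {t1: 0.2, t2: 0.4, t3: 0.3, t1 | t2: 0.1}
m2 = {t1: 0.3, t2: 0.1, t3: 0.4, t1 | t2: 0.2}

fused = pcr5(m1, m2)
for focal, mass in sorted(fused.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(focal), round(mass, 4))
```

Because each conflicting product is split exactly between the two elements that generated it, the fused masses still sum to one without any normalization step.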


III. WORKING WITH NON-EXISTENTIAL CONSTRAINTS 


In this section we show how this problem can be solved
with the classical Shafer’s approach and then we show how
it can be solved with PCR rules to get more specific results.


A. Shafer’s approach


Let’s consider a finite and discrete frame $\Theta_k = \{\theta_1, \theta_2, \ldots, \theta_n\}$
satisfying Shafer’s model with all $\theta_i \neq \emptyset$ at
a given time k, and two bba’s $m_{1,k}(.)$ and $m_{2,k}(.)$ provided
by two distinct sources of evidence. Each bba is defined on
the power set $2^{\Theta_k}$. Let’s assume now that at time k + 1 extra
knowledge is given about the non-existence of some elements
of $\Theta_k$. We denote such a non-existential constraint as NE (the
set of Non-Existing elements). For example, $NE_{k+1} = \{\theta_1\}$
means that actually $\theta_1 = \emptyset$, $NE_{k+1} = \{\theta_1, \theta_2\}$ means that both
$\theta_1 = \emptyset$ and $\theta_2 = \emptyset$, and so on. The new frame of discernment
we have to work with is then given by $\Theta_{k+1} = \Theta_k \setminus NE_{k+1}$.
The question is how to combine at time k + 1 the two original
bba’s $m_{1,k}(.)$ and $m_{2,k}(.)$ while taking into account our
knowledge of the revised frame $\Theta_{k+1}$ obtained from $\Theta_k$ and
$NE_{k+1}$.

Dempster-Shafer Theory (DST) [3] offers a mathematical
tool for answering this question: the Dempster-Shafer belief
conditioning rule (DSCR), which consists in combining with
Dempster-Shafer’s rule the prior bba m(.) with the conditioning
bba $m_c(.)$, which is focused only on the conditioning event
X, i.e. for which $m_c(X) = 1$. Mathematically, $m_{DS}(.|X)$ is
then defined⁴ by

$$
m_{DS}(.|X) = [m \oplus m_c](.) \tag{4}
$$

where $\oplus$ corresponds here to Dempster-Shafer’s rule of
combination and $m_c(X) = 1$.

For solving this fusion problem under non-existential
integrity constraints, three methods are a priori possible based
on DSCR:

• The Fusion-Conditioning approach (FC): It consists in
combining the sources first and then applying the Dempster-Shafer
conditioning rule. This corresponds to the following formula:

$$
m_{DS\text{-}FC}(.|\Theta_{k+1}) = [[m_{1,k} \oplus m_{2,k}] \oplus m_{c,k}](.) \tag{5}
$$

where $\oplus$ corresponds here to Dempster-Shafer’s rule of
combination and $m_{c,k}(\Theta_{k+1}) = 1$. Note that $m_{c,k}(.)$ refers to the
conditioning bba defined on $2^{\Theta_k}$.

⁴ if m(.) and $m_c(.)$ are not in total contradiction of course.

• The Conditioning-Fusion approach (CF): It consists in
applying the DS conditioning to the sources first and then
combining the conditioned bba’s with the Dempster-Shafer rule.
This corresponds to the following formula:

$$
m_{DS\text{-}CF}(.|\Theta_{k+1}) = [[m_{1,k} \oplus m_{c,k}] \oplus [m_{2,k} \oplus m_{c,k}]](.) \tag{6}
$$

• The Global Conditioning approach (GC): It consists in
combining all the bba’s altogether in a single fusion step.
This corresponds to the following formula:

$$
m_{DS\text{-}GC}(.|\Theta_{k+1}) = [m_{1,k} \oplus m_{2,k} \oplus m_{c,k}](.) \tag{7}
$$

Because of the commutativity and associativity of the DS rule,
and since $[m_c \oplus m_c](.) = m_c(.)$ for any conditioning bba
focused on only one specific element X, the three previous
methods provide exactly the same results. This makes Shafer’s
approach very appealing since there is no ambiguity in the
choice of the method to apply.


B. Example 1 (continued) 


Let’s go back to Example 1 and consider the two arbitrary
prior bba’s given in Table I.

| bba’s \ focal elem.          | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_1 \cup \theta_2$ |
| Prior: $m_{1,k}(.)$          | 0.2    | 0.4    | 0.3 | 0.1    |
| Prior: $m_{2,k}(.)$          | 0.3    | 0.1    | 0.4 | 0.2    |
| Conditioning: $m_{c,k}(.)$   | 0      | 0      | 0   | 1      |
| DS-FC: $m_{DS\text{-}FC}(.)$ | 0.4643 | 0.4643 | 0   | 0.0714 |
| DS-CF: $m_{DS\text{-}CF}(.)$ | 0.4643 | 0.4643 | 0   | 0.0714 |
| DS-GC: $m_{DS\text{-}GC}(.)$ | 0.4643 | 0.4643 | 0   | 0.0714 |

Table I
EXAMPLE 1: RESULTS WITH DS-BASED CONDITIONING.
Because in this example $\Theta_k = \{\theta_1, \theta_2, \theta_3\}$ and
$NE_{k+1} = \{\theta_3\}$, then $\Theta_{k+1} = \{\theta_1, \theta_2\}$ (only targets $\theta_1$
and $\theta_2$ survive) and therefore the conditioning bba $m_{c,k}(.)$
is defined by $m_{c,k}(\theta_1 \cup \theta_2) = 1$. In applying DSCR,
one gets with the three methods the same result,
$m_{DS\text{-}GC}(.|\Theta_{k+1}) = m_{DS\text{-}CF}(.|\Theta_{k+1}) = m_{DS\text{-}FC}(.|\Theta_{k+1})$
(as shown in the last three rows of Table I). This symmetrical
result in $\theta_1$ and $\theta_2$ is very surprising since clearly the
input bba’s are asymmetrical in $\theta_1$ and $\theta_2$, and we don’t
see any intuitive or rational justification to consider such
DSCR-based behavior as efficient for applications.


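• Direct approach: Note that this result can also be simply
obtained in a direct manner using the DS rule for combining
$m_{1,k}(.)$ with $m_{2,k}(.)$ and taking into account the constraint
$\theta_3 = \emptyset$ in the DS formula. In this example 1, one gets:

$$
\begin{aligned}
m_{12}(\theta_1) &= m_{1,k}(\theta_1)\, m_{2,k}(\theta_1) + m_{1,k}(\theta_1)\, m_{2,k}(\theta_1 \cup \theta_2) + m_{2,k}(\theta_1)\, m_{1,k}(\theta_1 \cup \theta_2) = 0.13 \\
m_{12}(\theta_2) &= m_{1,k}(\theta_2)\, m_{2,k}(\theta_2) + m_{1,k}(\theta_2)\, m_{2,k}(\theta_1 \cup \theta_2) + m_{2,k}(\theta_2)\, m_{1,k}(\theta_1 \cup \theta_2) = 0.13 \\
m_{12}(\theta_1 \cup \theta_2) &= m_{1,k}(\theta_1 \cup \theta_2)\, m_{2,k}(\theta_1 \cup \theta_2) = 0.02
\end{aligned}
$$

For $\theta_3$, one has $m_{12}(\theta_3) = m_{1,k}(\theta_3)\, m_{2,k}(\theta_3) = 0.12$. Since
actually $\theta_3 = \emptyset$, then $m_{12}(\theta_3 = \emptyset) = 0.12$ must be added to the
mass already committed to the empty set coming from the other
possible conflicting conjunctions, so that finally one gets
the total conflicting mass $m_{12}(\emptyset) = 0.72$. After the normalization
step, we finally get

$$
\begin{aligned}
m_{DS}(\theta_1) &= \frac{m_{12}(\theta_1)}{1 - m_{12}(\emptyset)} = 0.13/0.28 = 0.4643 \\
m_{DS}(\theta_2) &= \frac{m_{12}(\theta_2)}{1 - m_{12}(\emptyset)} = 0.13/0.28 = 0.4643 \\
m_{DS}(\theta_1 \cup \theta_2) &= \frac{m_{12}(\theta_1 \cup \theta_2)}{1 - m_{12}(\emptyset)} = 0.02/0.28 = 0.0714
\end{aligned}
$$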
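The equality of the FC, CF and GC approaches on Example 1 can be checked numerically with the short sketch below. It assumes Shafer’s model with frozenset focal elements; the function name `dempster` and the integer encoding of the targets are choices made for this illustration, and the global (GC) combination is evaluated here with a different grouping of the same three bba’s, which is legitimate because Dempster’s rule is associative.

```python
from itertools import product


def dempster(m1, m2):
    """Dempster's rule for two bba's whose focal elements are frozensets."""
    conj = {}
    for (X, a), (Y, b) in product(m1.items(), m2.items()):
        Z = X & Y
        if Z and a * b > 0.0:
            conj[Z] = conj.get(Z, 0.0) + a * b
    k = sum(conj.values())  # equals 1 minus the total conflicting mass
    if k == 0.0:
        raise ZeroDivisionError("total conflict: Dempster's rule gives 0/0")
    return {Z: v / k for Z, v in conj.items()}


# Hypothetical encoding: theta_i is the singleton frozenset {i}.
t1, t2, t3 = (frozenset({i}) for i in (1, 2, 3))
m1 = {t1: 0.2, t2: 0.4, t3: 0.3, t1 | t2: 0.1}
m2 = {t1: 0.3, t2: 0.1, t3: 0.4, t1 | t2: 0.2}
mc = {t1 | t2: 1.0}  # conditioning bba focused on the surviving targets

fc = dempster(dempster(m1, m2), mc)                # fusion, then conditioning
cf = dempster(dempster(m1, mc), dempster(m2, mc))  # conditioning, then fusion
gc = dempster(m1, dempster(m2, mc))                # same three bba's, other grouping

for name, res in (("FC", fc), ("CF", cf), ("GC", gc)):
    print(name, round(res[t1], 4), round(res[t2], 4), round(res[t1 | t2], 4))
```

All three groupings return the same symmetric masses for $\theta_1$ and $\theta_2$, in agreement with the values 0.13/0.28 and 0.02/0.28 derived above.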


• Advantages of the DS approach: The main interest of these
DSCR-based methods lies in the fact that DSCR can be
interpreted as a generalization of Bayesian conditioning and
that the conditioning and the DS fusion commute, so that
the three methods FC, CF and GC, all based on DSCR, coincide.


• Drawbacks of the DS approach: Although attractive, the DSCR
approach cannot circumvent the problem inherent to the DS
rule itself when the sources to combine are highly conflicting
or are, in the worst case, in total conflict. Even if the sources are
not too conflicting, DSCR can yield questionable results, as
pointed out in Example 1 (i.e. symmetrical results based on
asymmetrical inputs; see Table I).


C. Example 2 
This example is an extension of Zadeh’s example including a
non-existential constraint. Let’s take $\Theta_k = \{\theta_1, \theta_2, \theta_3, \theta_4\}$
satisfying Shafer’s model and the prior bba’s given
in Table II, and let’s assume that at time k + 1 we learn that
$\theta_4 = \emptyset$, so that $\Theta_{k+1} = \{\theta_1, \theta_2, \theta_3\}$. Applying any of the previous
methods provides the same counter-intuitive result $m_{DS}(\theta_3) = 1$
as in the classical Zadeh’s example.






































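| bba’s \ focal elem.          | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_4$ | $\theta_1 \cup \theta_2 \cup \theta_3$ |
| Prior: $m_{1,k}(.)$          | 0.98 | 0    | 0.01 | 0.01 | 0 |
| Prior: $m_{2,k}(.)$          | 0    | 0.98 | 0.01 | 0.01 | 0 |
| Conditioning: $m_{c,k}(.)$   | 0    | 0    | 0    | 0    | 1 |
| DS-FC: $m_{DS\text{-}FC}(.)$ | 0    | 0    | 1    | 0    | 0 |
| DS-CF: $m_{DS\text{-}CF}(.)$ | 0    | 0    | 1    | 0    | 0 |
| DS-GC: $m_{DS\text{-}GC}(.)$ | 0    | 0    | 1    | 0    | 0 |

Table II
EXAMPLE 2-A: RESULTS WITH DS-BASED CONDITIONING.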
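The counter-intuitive outcome reported in Table II is easy to reproduce. The sketch below is a minimal illustration under Shafer’s model, with frozenset focal elements; the name `dempster` and the integer encoding of the $\theta_i$ are assumptions of this example rather than code from the paper. It applies the Fusion-Conditioning order to the bba’s of Example 2.

```python
from itertools import product


def dempster(m1, m2):
    """Dempster's rule for two bba's whose focal elements are frozensets."""
    conj = {}
    for (X, a), (Y, b) in product(m1.items(), m2.items()):
        Z = X & Y
        if Z and a * b > 0.0:
            conj[Z] = conj.get(Z, 0.0) + a * b
    k = sum(conj.values())  # equals 1 minus the total conflicting mass
    if k == 0.0:
        raise ZeroDivisionError("total conflict: Dempster's rule gives 0/0")
    return {Z: v / k for Z, v in conj.items()}


# Hypothetical encoding: theta_i is the singleton frozenset {i}.
t1, t2, t3, t4 = (frozenset({i}) for i in (1, 2, 3, 4))
m1 = {t1: 0.98, t3: 0.01, t4: 0.01}
m2 = {t2: 0.98, t3: 0.01, t4: 0.01}
mc = {t1 | t2 | t3: 1.0}  # theta_4 has been declared non-existent

result = dempster(dempster(m1, m2), mc)
print({tuple(sorted(Z)): round(v, 4) for Z, v in result.items()})
```

Although each source gives only 0.01 to $\theta_3$, the whole belief ends up on $\theta_3$ once the 99.98% of conflicting mass and the mass of $\theta_4$ have been normalized away.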


This example can be generalized as in Table III, where all
bba’s are normalized and the non-existential constraint is
$A_4 \cup \ldots \cup A_n = \emptyset$. The result of the DSCR approach is given in the
right column of Table III.

| focal elem. \ bba’s       | $m_{1,k}(.)$  | $m_{2,k}(.)$  | $m_{c,k}(.)$ | DSCR result |
| $A_1$                     | $a_1$         | 0             | 0 | 0 |
| $A_2$                     | 0             | $a_2$         | 0 | 0 |
| $A_3$                     | $\epsilon_1$  | $\epsilon_2$  | 0 | 1 |
| $A_4$                     | $\delta_{11}$ | $\delta_{21}$ | 0 | 0 |
| ...                       | ...           | ...           | ... | ... |
| $A_n$                     | $\delta_{1n}$ | $\delta_{2n}$ | 0 | 0 |
| $A_1 \cup A_2 \cup A_3$   | 0             | 0             | 1 | 0 |

Table III

Here n > 1, $\epsilon_1$, $\epsilon_2$ and the $\delta_{ij}$ are very tiny positive numbers
in [0,1], $a_1$ and $a_2$ are positive numbers close to 1, but smaller
than 1, and the sum on each column is 1; all intersections
$A_i \cap A_j$ are empty, where the $A_i$ can be singletons or unions of
singletons. So, this is a Bayesian and non-Bayesian example.


D. Example 3


Here we give two very simple classes of examples with
Bayesian or non-Bayesian bba’s where DSCR cannot be
applied to solve the problem. We assume Shafer’s model for
the frames. In example 3-A, the non-existential constraint is
$\theta_1 = \emptyset$ and the parameters a and b belong to [0, 1].

| bba’s \ focal elem.        | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_2 \cup \theta_3$ |
| Prior: $m_{1,k}(.)$        | a | 0   | 1-a | 0 |
| Prior: $m_{2,k}(.)$        | b | 1-b | 0   | 0 |
| Conditioning: $m_{c,k}(.)$ | 0 | 0   | 0   | 1 |

Table IV
BBA’S FOR EXAMPLE 3-A. (BAYESIAN CASE WITH $\theta_1 = \emptyset$)


Example 3-A gives 0/0 when using Dempster-Shafer’s
conditioning rule.


In example 3-B, we consider non-Bayesian bba’s. The
parameters a and b belong to [0, 1]. The non-existential constraint
is $A_1 = \theta_1 \cup \theta_2 = \emptyset$.

| bba’s \ focal elem.        | $\theta_1 \cup \theta_2$ | $\theta_3$ | $\theta_4$ |
| Prior: $m_{1,k}(.)$        | a | 0   | 1-a |
| Prior: $m_{2,k}(.)$        | b | 1-b | 0   |
| Conditioning: $m_{c,k}(.)$ | 0 | 0   | 1   |

Table V
BBA’S FOR EXAMPLE 3-B. (NON-BAYESIAN CASE WITH $\theta_1 \cup \theta_2 = \emptyset$)


An infinity of Bayesian or non-Bayesian classes with
totally conflicting sources can be constructed where the DSCR rule
cannot be applied.


E. DSmT approach


Since PCR5 and PCR6 circumvent the problem of the DS
rule for combining potentially highly conflicting sources
of evidence, it is natural to try first to use the same
methodology for solving the problem, just replacing the
DS fusion operator $\oplus$ by the PCR5 (or PCR6) fusion operator.
This is called PCR5CR (PCR5-based conditioning rule), or
PCR6CR if one prefers to use PCR6. Unfortunately, the
solution based on these PCR rules is not so simple because
PCR rules are not associative and thus the result one gets
highly depends on the conditioning method we adopt: FC,
CF or Global. Moreover, the direct approach based on the
classical/original PCR5 rule under non-existential constraints
cannot be applied, as will be shown with Example 1.
That’s why we propose a new solution to this important
problem in the sequel.

Example 1 (continued): Let’s go back to example 1 and examine
the results given by the PCR5-FC, PCR5-CF and PCR5-GC
methods⁵. The results are given in Table VI.

| bba’s \ focal elem.              | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_1 \cup \theta_2$ |
| Prior: $m_{1,k}(.)$              | 0.2    | 0.4    | 0.3    | 0.1    |
| Prior: $m_{2,k}(.)$              | 0.3    | 0.1    | 0.4    | 0.2    |
| Conditioning: $m_{c,k}(.)$       | 0      | 0      | 0      | 1      |
| PCR5-FC: $m_{PCR5\text{-}FC}(.)$ | 0.2664 | 0.2927 | 0.3320 | 0.1089 |
| PCR5-CF: $m_{PCR5\text{-}CF}(.)$ | 0.3526 | 0.3822 | 0.0470 | 0.2182 |
| PCR5-GC: $m_{PCR5\text{-}GC}(.)$ | 0.1811 | 0.1975 | 0.1597 | 0.4617 |

Table VI
EXAMPLE 1: RESULTS WITH PCR5-BASED CONDITIONING.


From Table VI, one sees clearly that the original PCR5
rule used for solving this example generates different results
depending on the method (PCR5-FC, PCR5-CF or PCR5-GC),
which is not very satisfactory, and that all methods commit
a positive mass to $\theta_3 = \emptyset$, which is not acceptable since we
assume to work within Shafer’s model in this example.


Direct approach: If we now use a direct PCR5-based
approach for trying to solve the problem, we need to replace
$\theta_3$ by $\emptyset$ in the input bba’s and apply the PCR5g fusion rule
proposed in [5]. The PCR5g fusion formula is the same as the PCR5
fusion formula (3) except that $X \in D^\Theta$, where $D^\Theta$ includes
the empty set as well. In other words, the PCR5g fusion rule allows $\emptyset$
as a focal element (as in Smets’ TBM). If we apply this PCR5g
direct fusion, we get the results in Table VII, which are consistent with
the last row of Table VI, as expected.








| bba’s \ focal elem.        | $\theta_1$ | $\theta_2$ | $\emptyset$ | $\theta_1 \cup \theta_2$ |
| Prior: $m_{1,k}(.)$        | 0.2    | 0.4    | 0.3    | 0.1    |
| Prior: $m_{2,k}(.)$        | 0.3    | 0.1    | 0.4    | 0.2    |
| Conditioning: $m_{c,k}(.)$ | 0      | 0      | 0      | 1      |
| $m_{PCR5g,Direct}(.)$      | 0.1811 | 0.1975 | 0.1597 | 0.4617 |

Table VII
BBA’S FOR EXAMPLE 1 AND PCR5g-DIRECT RESULTS.


Example 2 (continued): Let’s take back example 2 and exam- 
ine the results given by PCR5-FC, PCR5-CF and PCR5-GC 
methods. The results are given in Table VIII (rounded when 
possible at the fourth decimal). 











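| bba’s \ focal elem.        | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_4 = \emptyset$ | $\theta_1 \cup \theta_2 \cup \theta_3$ |
| Prior: $m_{1,k}(.)$        | 0.98       | 0          | 0.01       | 0.01       | 0          |
| Prior: $m_{2,k}(.)$        | 0          | 0.98       | 0.01       | 0.01       | 0          |
| Conditioning: $m_{c,k}(.)$ | 0          | 0          | 0          | 0          | 1          |
| $m_{PCR5\text{-}FC}(.)$    | 0.49960202 | 0.49960202 | 0.00039798 | 0.00000016 | 0.00039782 |
| $m_{PCR5\text{-}CF}(.)$    | 0.49970100 | 0.49970100 | 0.00049796 | 0.00000007 | 0.00009997 |
| $m_{PCR5\text{-}GC}(.)$    | 0.32762253 | 0.32762253 | 0.00020045 | 0.00010047 | 0.34445402 |

Table VIII
EXAMPLE 2-A: RESULTS WITH PCR5-FC, PCR5-CF & PCR5-GC.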


Example 3 (continued): Let’s go back to example 3-A with
a = b = 0.9 and 1 - a = 1 - b = 0.1. The results given by
PCR5-FC, PCR5-CF and PCR5-GC are given in Table IX.

| bba’s \ focal elem.        | $\theta_1 = \emptyset$ | $\theta_2$ | $\theta_3$ | $\theta_2 \cup \theta_3$ |
| Prior: $m_{1,k}(.)$        | 0.9    | 0      | 0.1    | 0      |
| Prior: $m_{2,k}(.)$        | 0.9    | 0.1    | 0      | 0      |
| Conditioning: $m_{c,k}(.)$ | 0      | 0      | 0      | 1      |
| $m_{PCR5\text{-}FC}(.)$    | 0.4791 | 0.0140 | 0.0140 | 0.4929 |
| $m_{PCR5\text{-}CF}(.)$    | 0.4421 | 0.0605 | 0.0605 | 0.4369 |
| $m_{PCR5\text{-}GC}(.)$    | 0.4435 | 0.0053 | 0.0053 | 0.5458 |

Table IX
EXAMPLE 3-A: RESULTS WITH PCR5-FC, PCR5-CF & PCR5-GC.

⁵ i.e. the FC, CF and GC approaches when using the PCR5 rule of combination
instead of the DS rule.


In summary, we have shown with very simple examples
that the original PCR5-based approaches cannot be used directly
to solve the problem because they generate a non-normalized
bba (i.e. a bba with a positive value committed to $\emptyset$) and,
moreover, the result depends on the choice of the method because
of the non-associativity of PCR5 (and of PCR6 as well). It is worth
noting however that the results provided by the PCR5-based
approaches commit different masses to the non-empty focal
elements, contrary to the DS-based approaches. In the next
section we present new approaches for trying to solve the
problem.


IV. EXTENDED PCR RULES 


In this section we propose several ways to deal with the
fusion of sources under non-existential integrity constraints,
since the original PCR5 (or PCR6) cannot be applied directly.
This is the main reason why new solutions have to be found,
and this is the main contribution of this paper.


A. Simple solution based on normalization 


A simple solution would consist in using the original PCR5
or direct PCR5g rules with a final normalization step (not
included in the original formulas) consisting in dividing the
masses of all non-empty focal elements by $(1 - m(\emptyset))$. This method
can be applied only when $m(\emptyset) < 1$ of course. In example 1,
one gets the results given in Table X.



































| bba’s \ focal elem.        | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_1 \cup \theta_2$ |
| Prior: $m_{1,k}(.)$        | 0.2    | 0.4    | 0.3 | 0.1    |
| Prior: $m_{2,k}(.)$        | 0.3    | 0.1    | 0.4 | 0.2    |
| Conditioning: $m_{c,k}(.)$ | 0      | 0      | 0   | 1      |
| Normalized PCR5-FC bba     | 0.3988 | 0.4382 | 0   | 0.1630 |
| Normalized PCR5-CF bba     | 0.3700 | 0.4010 | 0   | 0.2290 |
| Normalized PCR5-GC bba     | 0.2155 | 0.2350 | 0   | 0.5495 |
| Normalized PCR5g bba       | 0.2155 | 0.2350 | 0   | 0.5495 |
| Normalized PCR6-GC bba     | 0.2133 | 0.2326 | 0   | 0.5541 |
| Normalized PCR6g bba       | 0.2133 | 0.2326 | 0   | 0.5541 |

Table X
BBA’S FOR EXAMPLE 1 AND PCR5CR-BASED RESULTS AFTER
NORMALIZATION.
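The normalization step itself is a one-liner. The sketch below, written with a hypothetical helper `normalize` and plain string labels for the focal elements (both assumptions of this illustration), applies it to the rounded PCR5-GC row of Table VI, where the mass 0.1597 sits on $\theta_3 = \emptyset$, and recovers the Normalized PCR5-GC row of Table X up to rounding.

```python
def normalize(m, empty_label):
    """Divide the masses of the non-empty focal elements by 1 - m(empty)."""
    m_empty = m.get(empty_label, 0.0)
    if m_empty >= 1.0:
        raise ValueError("normalization requires m(empty) < 1")
    return {X: v / (1.0 - m_empty) for X, v in m.items() if X != empty_label}


# PCR5-GC row of Table VI (values rounded to four decimals in the source).
pcr5_gc = {"t1": 0.1811, "t2": 0.1975, "t3": 0.1597, "t1 U t2": 0.4617}

normalized = normalize(pcr5_gc, "t3")
print({X: round(v, 3) for X, v in normalized.items()})
```

Because the inputs are already rounded, the last printed digit can differ slightly from the four-decimal values of Table X.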


Note that another result can be obtained from the PCR5 and CF
approach if one first normalizes the bba’s $m_1^{PCR5}(.|\theta_1 \cup \theta_2) =
[m_{1,k} \oplus m_{c,k}](.)$ and $m_2^{PCR5}(.|\theta_1 \cup \theta_2) = [m_{2,k} \oplus m_{c,k}](.)$, and
then applies the original PCR5 formula to combine them. We
denote this method as PCR5-CnF (n standing for the position
where the normalization step is done). In this case, one
gets $m_{PCR5\text{-}CnF}(\theta_1) = 0.391$, $m_{PCR5\text{-}CnF}(\theta_2) = 0.414$ and
$m_{PCR5\text{-}CnF}(\theta_1 \cup \theta_2) = 0.195$, which is still different from the
previous results.
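One way to read the PCR5-CnF procedure is sketched below: each prior is conditioned with PCR5, the mass left on the non-existent $\theta_3$ is dropped and the bba renormalized, and the two normalized bba’s are finally combined with PCR5. This reading, the helper names `pcr5` and `drop_and_normalize`, and the frozenset encoding are assumptions of this illustration; the sketch is limited to two sources under Shafer’s model.

```python
from itertools import product


def pcr5(m1, m2):
    """Two-source PCR5 on a power set; bba's map frozensets to masses."""
    out = {}
    for (X, a), (Y, b) in product(m1.items(), m2.items()):
        if a == 0.0 or b == 0.0:
            continue
        Z = X & Y
        if Z:
            out[Z] = out.get(Z, 0.0) + a * b
        else:
            out[X] = out.get(X, 0.0) + a * a * b / (a + b)
            out[Y] = out.get(Y, 0.0) + a * b * b / (a + b)
    return out


def drop_and_normalize(m, ne):
    """Discard focal elements made only of non-existent elements, then renormalize."""
    kept = {X: v for X, v in m.items() if X - ne}
    total = sum(kept.values())
    return {X: v / total for X, v in kept.items()}


# Hypothetical encoding: theta_i is the singleton frozenset {i}.
t1, t2, t3 = (frozenset({i}) for i in (1, 2, 3))
m1 = {t1: 0.2, t2: 0.4, t3: 0.3, t1 | t2: 0.1}
m2 = {t1: 0.3, t2: 0.1, t3: 0.4, t1 | t2: 0.2}
mc = {t1 | t2: 1.0}
ne = t3  # theta_3 is non-existent

c1 = drop_and_normalize(pcr5(m1, mc), ne)  # conditioned and normalized source 1
c2 = drop_and_normalize(pcr5(m2, mc), ne)  # conditioned and normalized source 2
cnf = pcr5(c1, c2)
print(round(cnf[t1], 3), round(cnf[t2], 3), round(cnf[t1 | t2], 3))
```

Under this reading the sketch returns the three values quoted in the text to three decimals.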

As one sees, the methods including a normalization step
now provide different results, and all agree that $\theta_2$ corresponds
to the hypothesis that has the highest belief or plausibility. There
is no ambiguity in the choice between $\theta_1$ and $\theta_2$, contrary
to the DS approach. The least uncertainty level is obtained with the
PCR5-FCn approach in this example.


B. A more efficient solution 

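Here we propose another way to solve the problem using
new extended PCR5 fusion formulas denoted PCR5a,
PCR5b and PCR5c.

• The PCR5a fusion rule: $m_{PCR5a}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \boldsymbol{\emptyset}$

$$
\begin{aligned}
m_{PCR5a}(A) ={}& m_{12}(A)
+ \sum_{\substack{X \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap A = \emptyset}} \left[ \frac{m_1(A)^2\, m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2\, m_1(X)}{m_2(A) + m_1(X)} \right] \\
& + \sum_{X \in \boldsymbol{\emptyset}} \left[ m_1(A)\, m_2(X) + m_2(A)\, m_1(X) \right]
+ m_{12}(A) \cdot \frac{\sum_{X,Y \in \boldsymbol{\emptyset}} m_1(X)\, m_2(Y)}{\sum_{Z \in G^\Theta \setminus \boldsymbol{\emptyset}} m_{12}(Z)}
\end{aligned}
\tag{8}
$$

In the PCR5a rule, one transfers the remaining conflicting
masses proportionally with respect to the non-null masses
resulting from the conjunctive rule.

• The PCR5b fusion rule: $m_{PCR5b}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \boldsymbol{\emptyset}$

$$
\begin{aligned}
m_{PCR5b}(A) ={}& m_{12}(A)
+ \sum_{\substack{X \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap A = \emptyset}} \left[ \frac{m_1(A)^2\, m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2\, m_1(X)}{m_2(A) + m_1(X)} \right] \\
& + \sum_{X \in \boldsymbol{\emptyset}} \left[ m_1(A)\, m_2(X) + m_2(A)\, m_1(X) \right]
+ \frac{\sum_{X,Y \in \boldsymbol{\emptyset}} m_1(X)\, m_2(Y)}{Card\{Z \mid Z \in G^\Theta \setminus \boldsymbol{\emptyset},\ m_{12}(Z) \neq 0\}}
\end{aligned}
\tag{9}
$$

In the PCR5b rule, one uniformly transfers the remaining
conflicting masses to all non-null masses resulting from the conjunctive
rule.

• The PCR5c fusion rule: $m_{PCR5c}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \boldsymbol{\emptyset}$

$$
\begin{aligned}
m_{PCR5c}(A) ={}& m_{12}(A)
+ \sum_{\substack{X \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap A = \emptyset}} \left[ \frac{m_1(A)^2\, m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2\, m_1(X)}{m_2(A) + m_1(X)} \right] \\
& + \sum_{X \in \boldsymbol{\emptyset}} \left[ m_1(A)\, m_2(X) + m_2(A)\, m_1(X) \right]
+ \sum_{\substack{X,Y \in \boldsymbol{\emptyset} \\ A = I_t}} m_1(X)\, m_2(Y)
\end{aligned}
\tag{10}
$$

In the PCR5c rule, one transfers all remaining conflicting masses
to the total ignorance $I_t$.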
For the PCR5a-PCR5c formulas (8)-(10) (and the next DSmHa-DSmHc
and DSa-DSc formulas too), if a denominator is equal
to zero, then its respective fraction is discarded, and
$\sum_{X,Y \in \boldsymbol{\emptyset}} m_1(X)\, m_2(Y)$ is transferred to the total ignorance. In
this case all three PCR5a-c rules coincide. Similarly, all DSmHa-c
coincide, and all DSa-c coincide as well. In the above formulas,
$m_{12}(A)$ is the mass obtained by the classical conjunctive
consensus:

$$
m_{12}(A) = \sum_{\substack{X_1, X_2 \in G^\Theta \\ X_1 \cap X_2 = A}} m_1(X_1)\, m_2(X_2) \tag{11}
$$

$G^\Theta$ is the fusion space (power-set, hyper-power set or super-power
set) depending on the underlying model chosen for the
frame $\Theta$, and $\boldsymbol{\emptyset}$ is the set of all empty sets that occur in the
fusion due to the integrity constraints.

Remarks:

1) If no constraint occurs (i.e. no focal element becomes
empty), then all PCR5a-PCR5c formulas coincide with the
classical PCR5 fusion rule. All these extended PCR5
rules can be extended for combining N > 2 sources of
evidence.

2) If all information about $m_1(.)$ and $m_2(.)$ and the constraints
(the sets which become empty in the fusion space) comes
simultaneously, we can use any of these three formulas.

3) The PCR5a formula is the best. The PCR5a and PCR5b
formulas keep the specificity resulting from the application of the
conjunctive rule. The PCR5c rule is less specific (and not
recommended).

4) These formulas can be modified easily into PCR6a-PCR6c
formulas by applying the PCR6 redistribution principle
to $m_1(.)$ and $m_2(.)$ and transferring the remaining
mass committed to the empty set as in the PCR5a-PCR5c
formulas.

5) In the case when the information comes sequentially, we
combine it in that order.


PCR5a is better than PCR5b and PCR5c because PCR5a
is more specific than both of them. Its higher specificity is
due to the fact that all masses of degenerated intersections
$m_{12}(A \cap B)$, where $A = B = \emptyset$, are redistributed
proportionally to all non-empty elements resulting from the
conjunctive rule. By contrast, PCR5c redistributes this whole
degenerated mass to the total ignorance (hence the lowest
specificity among this group of three related formulas), and
PCR5b uniformly splits this whole degenerated mass among all
non-empty elements (this means that PCR5b gives the
same amount to each non-empty element, while PCR5a gives
more degenerated mass to the elements which have a bigger
mass from the conjunctive rule).


Except for Smets’ fusion rule in TBM, we can adapt many
fusion rules which are based on the conjunctive rule, including
PCR6 too of course. We can adapt them in three ways, corresponding
to the previous PCR5a-PCR5c improved rules, replacing
only the first PCR5 summation in all three formulas with the
DSmH summation S2 [4], Vol. 1. For example, the DSmHa,
DSmHb and DSmHc extended rules are given by:
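• DSmHa fusion rule: $m_{DSmHa}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \boldsymbol{\emptyset}$

$$
\begin{aligned}
m_{DSmHa}(A) ={}& m_{12}(A)
+ \sum_{\substack{X,Y \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap Y = \emptyset \\ X \cup Y = A}} m_1(X)\, m_2(Y)
+ \sum_{X \in \boldsymbol{\emptyset}} \left[ m_1(A)\, m_2(X) + m_2(A)\, m_1(X) \right] \\
& + m_{12}(A) \cdot \frac{\sum_{X,Y \in \boldsymbol{\emptyset}} m_1(X)\, m_2(Y)}{\sum_{Z \in G^\Theta \setminus \boldsymbol{\emptyset}} m_{12}(Z)}
\end{aligned}
\tag{12}
$$

• DSmHb fusion rule: $m_{DSmHb}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \boldsymbol{\emptyset}$

$$
\begin{aligned}
m_{DSmHb}(A) ={}& m_{12}(A)
+ \sum_{\substack{X,Y \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap Y = \emptyset \\ X \cup Y = A}} m_1(X)\, m_2(Y)
+ \sum_{X \in \boldsymbol{\emptyset}} \left[ m_1(A)\, m_2(X) + m_2(A)\, m_1(X) \right] \\
& + \frac{\sum_{X,Y \in \boldsymbol{\emptyset}} m_1(X)\, m_2(Y)}{Card\{Z \mid Z \in G^\Theta \setminus \boldsymbol{\emptyset},\ m_{12}(Z) \neq 0\}}
\end{aligned}
\tag{13}
$$

• DSmHc fusion rule: $m_{DSmHc}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \boldsymbol{\emptyset}$

$$
\begin{aligned}
m_{DSmHc}(A) ={}& m_{12}(A)
+ \sum_{\substack{X,Y \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap Y = \emptyset \\ X \cup Y = A}} m_1(X)\, m_2(Y)
+ \sum_{X \in \boldsymbol{\emptyset}} \left[ m_1(A)\, m_2(X) + m_2(A)\, m_1(X) \right] \\
& + \sum_{\substack{X,Y \in \boldsymbol{\emptyset} \\ A = I_t}} m_1(X)\, m_2(Y)
\end{aligned}
\tag{14}
$$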


The classic DSmH rule [4] (Vol. 1) redistributes the whole
conflicting mass of the form $m_{12}(A \cap B)$, with $A = B = \emptyset$,
resulting from the conjunctive rule, to the total ignorance;
the classic DSmH rule is equivalent to (gives the same result as)
DSmHc. But DSmHa and DSmHb are more specific than
DSmHc (= classic DSmH) for exactly the same reason as
explained before regarding the higher specificity of PCR5a
with respect to PCR5b and PCR5c. DSmHa is the most
specific among all three DSmHa-DSmHc rules. That’s why
we need DSmHa. In addition, in the three formulas of
DSmHa-DSmHc we can condense the first two summations
($m_{12}(A) + \sum \ldots + \sum \ldots + \ldots$) into one summation only, i.e.
under the first summation we can write $X, Y \in G^\Theta$ (so X, Y
can be empty as well) and the second summation disappears
(it is absorbed by the first).


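Similarly, for Dempster-Shafer’s extended rule in the DSm
way, we replace in all three formulas the first PCR5
summation by

$$
m_{12}(A) \cdot \frac{\sum_{\substack{X,Y \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap Y = \emptyset}} m_1(X)\, m_2(Y)}{\sum_{Z \in G^\Theta \setminus \boldsymbol{\emptyset}} m_{12}(Z)}
$$

• DSa fusion rule: $m_{DSa}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \boldsymbol{\emptyset}$

$$
\begin{aligned}
m_{DSa}(A) ={}& m_{12}(A)
+ m_{12}(A) \cdot \frac{\sum_{\substack{X,Y \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap Y = \emptyset}} m_1(X)\, m_2(Y)}{\sum_{Z \in G^\Theta \setminus \boldsymbol{\emptyset}} m_{12}(Z)}
+ \sum_{X \in \boldsymbol{\emptyset}} \left[ m_1(A)\, m_2(X) + m_2(A)\, m_1(X) \right] \\
& + m_{12}(A) \cdot \frac{\sum_{X,Y \in \boldsymbol{\emptyset}} m_1(X)\, m_2(Y)}{\sum_{Z \in G^\Theta \setminus \boldsymbol{\emptyset}} m_{12}(Z)}
\end{aligned}
\tag{15}
$$

• DSb fusion rule: $m_{DSb}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \boldsymbol{\emptyset}$

$$
\begin{aligned}
m_{DSb}(A) ={}& m_{12}(A)
+ m_{12}(A) \cdot \frac{\sum_{\substack{X,Y \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap Y = \emptyset}} m_1(X)\, m_2(Y)}{\sum_{Z \in G^\Theta \setminus \boldsymbol{\emptyset}} m_{12}(Z)}
+ \sum_{X \in \boldsymbol{\emptyset}} \left[ m_1(A)\, m_2(X) + m_2(A)\, m_1(X) \right] \\
& + \frac{\sum_{X,Y \in \boldsymbol{\emptyset}} m_1(X)\, m_2(Y)}{Card\{Z \mid Z \in G^\Theta \setminus \boldsymbol{\emptyset},\ m_{12}(Z) \neq 0\}}
\end{aligned}
\tag{16}
$$

• DSc fusion rule: $m_{DSc}(\emptyset) = 0$ and $\forall A \in G^\Theta \setminus \boldsymbol{\emptyset}$

$$
\begin{aligned}
m_{DSc}(A) ={}& m_{12}(A)
+ m_{12}(A) \cdot \frac{\sum_{\substack{X,Y \in G^\Theta \setminus \boldsymbol{\emptyset} \\ X \cap Y = \emptyset}} m_1(X)\, m_2(Y)}{\sum_{Z \in G^\Theta \setminus \boldsymbol{\emptyset}} m_{12}(Z)}
+ \sum_{X \in \boldsymbol{\emptyset}} \left[ m_1(A)\, m_2(X) + m_2(A)\, m_1(X) \right] \\
& + \sum_{\substack{X,Y \in \boldsymbol{\emptyset} \\ A = I_t}} m_1(X)\, m_2(Y)
\end{aligned}
\tag{17}
$$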


Note that all these extended fusion rules are however not 
associative and therefore if one has several sources available 
at a given time to combine, the combination must be applied 
with all sources together to get optimal fusion result. 


C. Example 1 (continued) 


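Let's examine in detail the results obtained on Example 1 with all these extended fusion formulas. Because $\theta_3 = \emptyset$ and Shafer's model is assumed for $\Theta_k$, the set of elements becoming empty is $\boldsymbol{\emptyset} = \{\theta_1 \cap \theta_2, \theta_1 \cap \theta_3, \theta_2 \cap \theta_3, (\theta_1 \cup \theta_2) \cap \theta_3, \theta_3\}$ and one has: $m_{12}(\theta_1 \cap \theta_2) = 0.14$, $m_{12}(\theta_1 \cap \theta_3) = 0.17$, $m_{12}(\theta_2 \cap \theta_3) = 0.19$, $m_{12}((\theta_1 \cup \theta_2) \cap \theta_3) = 0.10$, $m_{12}(\theta_3) = 0.12$. $m_{12}(\theta_1 \cap \theta_2 \in \boldsymbol{\emptyset}) = 0.14$ is redistributed back to $\theta_1$ and $\theta_2$ using the PCR5 principle:

$$\frac{x1_{\theta_1}}{0.2} = \frac{y1_{\theta_2}}{0.1} = \frac{0.02}{0.3} = \frac{0.2}{3}, \qquad \frac{x2_{\theta_1}}{0.3} = \frac{y2_{\theta_2}}{0.4} = \frac{0.12}{0.7} = \frac{1.2}{7}$$

$$x1_{\theta_1} = 0.2 \cdot \frac{0.2}{3} \approx 0.013, \qquad x2_{\theta_1} = 0.3 \cdot \frac{1.2}{7} \approx 0.051$$

$$y1_{\theta_2} = 0.1 \cdot \frac{0.2}{3} \approx 0.007, \qquad y2_{\theta_2} = 0.4 \cdot \frac{1.2}{7} \approx 0.069$$

$m_{12}(\theta_1 \cap \theta_3 \in \boldsymbol{\emptyset}) = 0.17$ is all redistributed back to $\theta_1$ since $\theta_3 = \emptyset$ (non-existential constraint). $m_{12}(\theta_2 \cap \theta_3 \in \boldsymbol{\emptyset}) = 0.19$ is all redistributed back to $\theta_2$ since $\theta_3 = \emptyset$ (non-existential constraint). $m_{12}((\theta_1 \cup \theta_2) \cap \theta_3 \in \boldsymbol{\emptyset}) = 0.10$ is all redistributed back to $\theta_1 \cup \theta_2$ since $\theta_3 = \emptyset$ (non-existential constraint). Finally, $m_{12}(\theta_3 = \emptyset) = 0.12$ is redistributed differently by each of the PCR5a, PCR5b and PCR5c formulas: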


1) In PCR5a:

$$\frac{x_{\theta_1}}{0.13} = \frac{y_{\theta_2}}{0.13} = \frac{z_{\theta_1 \cup \theta_2}}{0.02} = \frac{0.12}{0.28} = \frac{3}{7}$$

whence $x_{\theta_1} = y_{\theta_2} = 0.13 \cdot 3/7 \approx 0.056$ and $z_{\theta_1 \cup \theta_2} = 0.02 \cdot 3/7 \approx 0.008$.
2) In PCR5b: $x_{\theta_1} = y_{\theta_2} = z_{\theta_1 \cup \theta_2} = 0.12/3 = 0.04$.
3) In PCR5c: $z_{\theta_1 \cup \theta_2} = 0.12$.

Finally, one gets the results shown in Table XI. From these results, one sees that the PCR5a rule provides the most specific result, since the mass committed to the uncertainty is the lowest compared with what we get with PCR5b, PCR5c and the other PCR5-based normalized conditioning rules given in Table X. PCR5b is also a bit better (more specific) than the PCR5-based normalized conditioning rules. As we see, and as expected from the theory, PCR5c is less specific than PCR5a and PCR5b. If we use the DSmHa-DSmHc fusion rules on this example, $m_{12}(\theta_1 \cap \theta_2 \in \boldsymbol{\emptyset}) = 0.14$ is all redistributed back to $\theta_1 \cup \theta_2$ using the DSmH principle [4], Vol. 1. The other conflicting masses are redistributed in the same way as in the PCR5a-PCR5c rules, respectively. The same example for Dempster-Shafer's rule extended in the DSm style: $m_{12}(\theta_1 \cap \theta_2 \in \boldsymbol{\emptyset}) = 0.14$ is all redistributed back to $\theta_1$, $\theta_2$ and $\theta_1 \cup \theta_2$, since they are non-empty, proportionally to their conjunctive rule masses 0.13, 0.13 and 0.02, respectively:

$$\frac{x_{\theta_1}}{0.13} = \frac{y_{\theta_2}}{0.13} = \frac{z_{\theta_1 \cup \theta_2}}{0.02} = \frac{0.14}{0.28} = 0.5$$

whence $x_{\theta_1} = y_{\theta_2} = 0.13(0.5) = 0.065$ and $z_{\theta_1 \cup \theta_2} = 0.02(0.5) = 0.010$. The other conflicting masses are redistributed in the same way as in the PCR5a-PCR5c rules, respectively. The results obtained with the DSmHa-DSmHc and DSa-DSc rules are given in Table XI. In this example, one sees that PCR5a is the most specific rule and that, in all cases, the rational decision to take is $\theta_2$ without ambiguity, contrary to the DSCR approach.









































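| bba's \ focal elem. | $\theta_1$ | $\theta_2$ | $\theta_3 = \emptyset$ | $\theta_1 \cup \theta_2$ |
|---|---|---|---|---|
| Prior: $m_{1,k}(.)$ | 0.2 | 0.4 | 0.3 | 0.1 |
| Prior: $m_{2,k}(.)$ | 0.3 | 0.1 | 0.4 | 0.2 |
| $m_{PCR5a}$ | 0.420 | 0.452 | 0 | 0.128 |
| $m_{PCR5b}$ | 0.404 | 0.436 | 0 | 0.160 |
| $m_{PCR5c}$ | 0.364 | 0.396 | 0 | 0.240 |
| $m_{DSa}$ | 0.421 | 0.441 | 0 | 0.138 |
| $m_{DSb}$ | 0.405 | 0.425 | 0 | 0.170 |
| $m_{DSc}$ | 0.365 | 0.385 | 0 | 0.250 |
| $m_{DSmHa}$ | 0.356 | 0.376 | 0 | 0.268 |
| $m_{DSmHb}$ | 0.340 | 0.360 | 0 | 0.300 |
| $m_{DSmHc}$ | 0.300 | 0.320 | 0 | 0.380 |

Table XI


EXAMPLE 1: PCR5A-C & DSA-C & DSMHA-C RESULTS.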
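The PCR5a-PCR5c rows of Table XI can be re-derived numerically. The sketch below is only an illustration, not the authors' implementation: it encodes focal elements as frozensets of atom indices, treats $\theta_3$ as the atom that became empty, and uses a hypothetical helper `extended_pcr5` whose name and structure are our own assumptions. It applies the redistribution steps described in Example 1 and does not guard against the degenerate case where no conjunctive mass survives.

```python
from itertools import product

# Focal elements are frozensets of atom indices; theta3 (atom 3) has become empty.
T1, T2, T3 = frozenset({1}), frozenset({2}), frozenset({3})
T12 = T1 | T2
EMPTY_ATOMS = frozenset({3})

m1 = {T1: 0.2, T2: 0.4, T3: 0.3, T12: 0.1}
m2 = {T1: 0.3, T2: 0.1, T3: 0.4, T12: 0.2}


def is_empty(x):
    """True when no atom of x survives the non-existential constraint."""
    return not (x - EMPTY_ATOMS)


def add(d, key, value):
    d[key] = d.get(key, 0.0) + value


def extended_pcr5(m1, m2, variant, ignorance):
    """Hypothetical helper: two-source fusion following the steps of Example 1."""
    conj, out, both_empty = {}, {}, 0.0
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        p = mx * my
        if is_empty(x) and is_empty(y):
            both_empty += p                      # redistributed per variant below
        elif is_empty(x):
            add(out, y, p)                       # all mass goes to the non-empty set
        elif is_empty(y):
            add(out, x, p)
        elif is_empty(x & y):
            add(out, x, mx * p / (mx + my))      # classical PCR5 proportional split
            add(out, y, my * p / (mx + my))
        else:
            add(conj, (x & y) - EMPTY_ATOMS, p)  # conjunctive consensus m12
    for key, value in conj.items():
        add(out, key, value)
    if variant == "a":                           # proportional to the m12 masses
        total = sum(conj.values())
        for key, value in conj.items():
            add(out, key, both_empty * value / total)
    elif variant == "b":                         # equal shares among sets with m12 != 0
        for key in conj:
            add(out, key, both_empty / len(conj))
    else:                                        # variant "c": to the total ignorance
        add(out, ignorance, both_empty)
    return out


results = {v: extended_pcr5(m1, m2, v, T12) for v in "abc"}
for v, r in results.items():
    print(v, round(r[T1], 3), round(r[T2], 3), round(r[T12], 3))
```

With unrounded intermediate values the masses agree with Table XI to within about 0.002; the small differences appear to come from the rounded intermediate values used in the text.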


V. EXAMPLES 


Here we present the solutions of Examples 2-A, 3-A and 3-B obtained with our new extended PCR5a-PCR5c rules of combination for solving the fusion of bba's under non-existential constraints in degenerate cases.

A. Example 2 (continued)

Let's consider Example 2-A and apply the PCR5a-PCR5c formulas. Using the PCR5 principle, $m_{12}(\theta_1 \cap \theta_2) = 0.98 \cdot 0.98 = 0.9604$ is redistributed back to $\theta_1$ and $\theta_2$ in equal proportions $x_{\theta_1} = x_{\theta_2} = 0.4802$; $m_{12}(\theta_1 \cap \theta_3) = 0.98 \cdot 0.01 = 0.0098$ is redistributed to $\theta_1$ and $\theta_3$ with $x_{\theta_1} = 0.00970101$ and $x_{\theta_3} = 0.00009899$; $m_{12}(\theta_2 \cap \theta_3) = 0.0098$ is redistributed to $\theta_2$ and $\theta_3$ with $x_{\theta_2} = 0.00970101$ and $x_{\theta_3} = 0.00009899$; $m_{12}(\theta_1 \cap \theta_4) = 0.0098$ is transferred to $\theta_1$ only since $\theta_4 = \emptyset$; $m_{12}(\theta_2 \cap \theta_4) = 0.0098$ is transferred to $\theta_2$ only since $\theta_4 = \emptyset$; $m_{12}(\theta_3 \cap \theta_4) = 0.0002$ is transferred to $\theta_3$ only since $\theta_4 = \emptyset$. Since only $m_{12}(\theta_3) \neq 0$ with $\theta_3 \neq \emptyset$, the mass $m_{12}(\theta_4) = 0.0001$ is transferred to $\theta_3$ in both the PCR5a and PCR5b formulas. But in the PCR5c rule, $m_{12}(\theta_4)$ is transferred to the total ignorance $I_t = \theta_1 \cup \theta_2 \cup \theta_3$. The final results obtained with PCR5a, PCR5b (same as with PCR5a for this example) and PCR5c are given in Table XII below.





























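| focal el. \ bba's | $m_{1,k}$ | $m_{2,k}$ | $m_{PCR5a,b}(.)$ | $m_{PCR5c}(.)$ |
|---|---|---|---|---|
| $\theta_1$ | 0.98 | 0 | 0.49970101 | 0.49970101 |
| $\theta_2$ | 0 | 0.98 | 0.49970101 | 0.49970101 |
| $\theta_3$ | 0.01 | 0.01 | $5.9798 \cdot 10^{-4}$ | $4.9798 \cdot 10^{-4}$ |
| $\theta_4 = \emptyset$ | 0.01 | 0.01 | 0 | 0 |
| $\theta_1 \cup \theta_2 \cup \theta_3$ | 0 | 0 | 0 | 0.0001 |

Table XII


EXAMPLE 2-A: RESULTS WITH PCR5A-C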


B. Example 3 (continued) 

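In Example 3-A, $\theta_1$ becomes empty and therefore: $m_{12}(\theta_1 \cap \theta_2) = a(1 - b)$ goes to $\theta_2$, $m_{12}(\theta_1 \cap \theta_3) = b(1 - a)$ goes to $\theta_3$, and $m_{12}(\theta_2 \cap \theta_3) = 1 - a - b + ab$ is split between $\theta_2$ and $\theta_3$ proportionally to $1 - b$ and $1 - a$ respectively:

$$\frac{x_{\theta_2}}{1 - b} = \frac{x_{\theta_3}}{1 - a} = \frac{1 - a - b + ab}{2 - a - b}$$

Therefore, one finally gets

$$x_{\theta_2} = \frac{1 - a - 2b + 2ab + b^2 - ab^2}{2 - a - b}$$

$$x_{\theta_3} = \frac{1 - 2a - b + 2ab + a^2 - a^2 b}{2 - a - b}$$

Since $\theta_1 = \emptyset$, $m_{12}(\theta_1) = ab$ is redistributed to $\theta_2 \cup \theta_3$ in the PCR5a-PCR5c formulas because all $m_{12}(X) = 0$ for $X \neq \emptyset$.

The final results are given in Table XIII, depending on the values of the parameters $a$ and $b$.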























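| focal elem. \ bba's | $m_{PCR5a,b,c}(.)$ for $a \neq 1$, $b \neq 1$ | $m_{PCR5a,b,c}(.)$ for $a = b = 1$ |
|---|---|---|
| $\theta_1$ | 0 | 0 |
| $\theta_2$ | $a(1 - b) + \frac{(1 - a)(1 - b)^2}{2 - a - b}$ | 0 |
| $\theta_3$ | $b(1 - a) + \frac{(1 - a)^2 (1 - b)}{2 - a - b}$ | 0 |
| $\theta_2 \cup \theta_3$ | $ab$ | 1 |

Table XIII
EXAMPLE 3-A: RESULTS WITH PCR5A-PCR5C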


The extended PCR5 rules give the same results for Example 3-B as for Example 3-A, where we replace $\theta_1$ by $\theta_1 \cup \theta_2$, $\theta_2$ by $\theta_3$, $\theta_3$ by $\theta_4$, and $\theta_2 \cup \theta_3$ by $\theta_3 \cup \theta_4$. If we take, for example, $a = b = 0.9$ and $1 - a = 1 - b = 0.1$ in Examples 3-A and 3-B, then we finally obtain:





























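| bba's \ focal elem. | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_2 \cup \theta_3$ |
|---|---|---|---|---|
| $m_{PCR5a-c}(.)$ | 0 | 0.095 | 0.095 | 0.810 |
| $m_{DSmHa-c}(.)$ | 0 | 0.090 | 0.090 | 0.820 |
| $m_{DSa-c}(.)$ | 0 | 0.090 | 0.090 | 0.820 |

Table XIV


EXAMPLE 3-A: RESULTS WITH a = b = 0.9 AND 1 - a = 1 - b = 0.1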























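| bba's \ focal elem. | $\theta_1 \cup \theta_2$ | $\theta_3$ | $\theta_4$ | $\theta_3 \cup \theta_4$ |
|---|---|---|---|---|
| $m_{PCR5a-c}(.)$ | 0 | 0.095 | 0.095 | 0.810 |
| $m_{DSmHa-c}(.)$ | 0 | 0.090 | 0.090 | 0.820 |
| $m_{DSa-c}(.)$ | 0 | 0.090 | 0.090 | 0.820 |

Table XV


EXAMPLE 3-B: RESULTS WITH a = b = 0.9 AND 1 - a = 1 - b = 0.1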


Dempster-Shafer’s rule cannot be applied in these examples 
since it gives 0/0. 


VI. CONCLUSIONS 


In this paper we extend the classical PCR5 and DSmH fusion rules to two families of new fusion rules, PCR5a-PCR5c and DSmHa-DSmHc respectively, in order to take into consideration the non-existence constraints (i.e. when some sets become empty) that may occur during a dynamic fusion. Further, we show that the same DSmT extension procedure applied to PCR5 and DSmH can be applied to Dempster's rule and other rules as well. We provide several examples with these PCR5a-PCR5c and DSmHa-DSmHc rules, and also with the Dempster-Shafer conditioning rule (DSCR). We have presented some classes of counter-examples to DSCR. If we have two sources, what should be done first: fusion and then conditioning, or conditioning and then fusion? A simple answer would be to do them in the order in which we receive the information. But if we receive all of it simultaneously, it is better to use these new extended rules, chosen according to the specificity we want to get, PCR5a being the most specific rule.


REFERENCES 


[1] J. Dezert, F. Smarandache, A new probabilistic transformation of belief mass assignment, in Proc. of Fusion 2008, Cologne, Germany, July 2008. (available from http://fs.gallup.unm.edu//DSmT.htm)

[2] J. Dezert, F. Smarandache, Non Bayesian conditioning and deconditioning, International Workshop on Belief Functions, Brest, France, April 2010. http://bfas.iutlan.univ-rennes1.fr/belief2010/

[3] G. Shafer, A mathematical theory of evidence, Princeton Univ. Press, 1976.

[4] F. Smarandache, J. Dezert (Editors), Advances and Applications of DSmT for Information Fusion, American Research Press, Rehoboth, Vol. 1-3, 2004-2009. (available from http://fs.gallup.unm.edu//DSmT.htm)

[5] F. Smarandache, J. Dezert, J.-M. Tacnet, Fusion of sources of evidence with different importances and reliabilities, in Proceedings of Fusion 2010 conference, Edinburgh, UK, July 2010.

[6] P. Smets, Constructing the pignistic probability function in a context of uncertainty, Uncertainty in Artificial Intelligence, Vol. 5, pp. 29-39, 1990.

[7] http://bfas.iutlan.univ-rennes1.fr/wiki/index.php/Toolboxs







A Fuzzy-Cautious OWA Approach with 
Evidential Reasoning 


Deqiang Han
Jean Dezert 
Jean-Marc Tacnet 
Chongzhao Han 


Abstract: Multi-criteria decision making (MCDM) is to make decisions in the presence of multiple criteria. To make a decision in the framework of MCDM under uncertainty, a novel Fuzzy-Cautious OWA with evidential reasoning (FCOWA-ER) approach is proposed in this paper. The payoff matrix and the belief functions of the states of nature are used to generate the expected payoffs, based on which two Fuzzy Membership Functions (FMFs) representing the optimistic and pessimistic attitudes, respectively, can be obtained. Two basic belief assignments (bba's) are then generated from the two FMFs. By evidence combination, a combined bba is obtained, which can be used to make the decision. There is no problem of weights selection in FCOWA-ER as in traditional OWA. When compared with other evidential reasoning-based OWA approaches such as COWA-ER, FCOWA-ER has lower computational cost and clearer physical meaning. Some experiments and related analyses are provided to justify our proposed FCOWA-ER.

Index Terms: Evidence theory, OWA, belief function, uncertainty, decision making, information fusion.


I. INTRODUCTION 


In real-life situations, decision making always encounters difficult multi-criteria problems [1]. In the classical Multi-Criteria Decision Making (MCDM) framework, the ordered weighted averaging (OWA) approach proposed by Yager [2] has been increasingly used in a wide range of successful applications for the aggregation of decision making problems, such as image processing, fuzzy control, market prediction and expert systems [3]. OWA is a generalized mean operator providing flexibility in the aggregation, so that the aggregation can be bounded between the minimum and maximum operators. This flexibility of the OWA operator is implemented by using the concept of orness (optimism) [4], which is a surrogate for the decision maker's attitude. One important issue in OWA aggregation is the determination of the associated weights. Many approaches [5]-[10] have been proposed to determine the weights in OWA. See the related references for details.

In multi-criteria decision making, decisions are often made under uncertainty: the information is provided by several more or less reliable sources and depends on the states of the world, so decisions can be taken in a certain, risky or uncertain environment. To implement decision making under uncertainty, many approaches have been proposed, including DS-AHP [11], DSmT-AHP [12] and ER-MCDA [13]. For OWA under uncertainty in particular, Yager proposed an OWA approach with evidential reasoning [14]. In our previous work, a cautious OWA with evidential reasoning (COWA-ER) was proposed to take into account the imperfect evaluations of the alternatives and the unknown beliefs about groups of the possible states of the world. COWA-ER mixes MCDM principles, decision under uncertainty principles and evidential reasoning. There is no weights selection step in COWA-ER, which is good for practical use. Recently, we found that COWA-ER also has drawbacks. More precisely, the computational cost of the combination of the different pieces of evidence in COWA-ER depends heavily on the number of alternatives encountered in decision making. When the number of alternatives is large, the computational cost increases significantly.

Originally published as Han D., Dezert J., Tacnet J.-M., Han C., A Fuzzy-Cautious OWA Approach with Evidential Reasoning, in Proc. of Fusion 2012, Singapore, July 2012, and reprinted with permission.

In this paper, we propose a modified COWA-ER approach, called Fuzzy-Cautious OWA with Evidential Reasoning (FCOWA-ER), which uses a different way to manage the uncertainty caused by weights selection. The payoff matrix together with the belief structure (knowledge of the states of nature) is used to generate two Fuzzy Membership Functions (FMFs) representing the optimistic and pessimistic attitudes, respectively. Then two bba's can be obtained from the two FMFs by using the α-cut approach. Based on evidence combination, the combined bba can be obtained and the final decision can be made. The FCOWA-ER approach does not need an (ad hoc) selection of weights as in traditional OWA. When compared with COWA-ER, FCOWA-ER has lower computational cost and clearer physical meaning because it requires only one combination operation regardless of the number of alternatives. The proposed FCOWA-ER can be seen as a trade-off between the optimistic and the pessimistic attitudes. The preference between the two attitudes can be adjusted by the user through discounting factors in the combination of evidence. Some experiments and related analyses are provided to show the rationality and efficiency of this new FCOWA-ER approach.


II. MULTI-CRITERIA DECISION MAKING UNDER 
UNCERTAINTY 


Multi-criteria decision making (MCDM) refers to making decisions in the presence of multiple, usually conflicting or discordant, criteria. Consider the following matrix $C$ provided to a decision maker:

$$\begin{array}{c|ccccc}
 & S_1 & \cdots & S_j & \cdots & S_n \\ \hline
A_1 & C_{11} & \cdots & C_{1j} & \cdots & C_{1n} \\
\vdots & \vdots & & \vdots & & \vdots \\
A_i & C_{i1} & \cdots & C_{ij} & \cdots & C_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
A_q & C_{q1} & \cdots & C_{qj} & \cdots & C_{qn}
\end{array} = C$$

In the above, each $A_i$ corresponds to a possible alternative available to the decision maker. Each $S_j$ corresponds to a possible value of the variable called the state of nature. $C_{ij}$ corresponds to the payoff to be received by the decision maker if he selects action $A_i$ and the state of nature is $S_j$. The problem encountered by the decision maker in MCDM is to select the action which gives him the optimum payoff.

Among all the available MCDM approaches, Ordered 
Weighted Averaging (OWA) is a very important one, which 
is introduced below. 


A. Ordered Weighted Averaging (OWA) 


OWA was proposed by Yager in [2]. An OWA operator of dimension $n$ is a function $F : \mathbb{R}^n \to \mathbb{R}$ that has an associated weighting vector¹ $W = [w_1, w_2, \ldots, w_n]^T$ such that $w_i \in [0, 1]$ and $\sum_i w_i = 1$. For any set of values $a_1, \ldots, a_n$,

$$F(a_1, \ldots, a_n) = \sum_{i=1}^{n} (w_i \cdot b_i) \quad (1)$$

where $b_i$ is the $i$th largest element in the collection $a_1, \ldots, a_n$. It should be noted that the weights in the OWA operator are associated with a position in the ordered arguments rather than with a particular argument.

The OWA operator depends on the associated weights, hence the determination of the weights is crucial. Some commonly used weights selection strategies are as follows [14]:

1) Pessimistic Attitude: If $W = [0, 0, \ldots, 1]^T$, then $F(a_1, a_2, \ldots, a_n) = \min_j [a_j]$.

2) Optimistic Attitude: If $W = [1, 0, \ldots, 0]^T$, then $F(a_1, a_2, \ldots, a_n) = \max_j [a_j]$.

3) Hurwicz Strategy: If $W = [\alpha, 0, \ldots, 1 - \alpha]^T$, then $F(a_1, a_2, \ldots, a_n) = \alpha \cdot \max_j [a_j] + (1 - \alpha) \cdot \min_j [a_j]$.

4) Normative Strategy: If $W = [1/n, 1/n, \ldots, 1/n]^T$, then $F(a_1, a_2, \ldots, a_n) = (1/n) \sum_{i=1}^{n} a_i$.

The OWA operator can be seen as the decision-making 
under ignorance, because in classical OWA, there is no knowl- 
edge about the true state of the nature but that it belongs to a 
finite set. It should be noted that the pessimistic and optimistic 
strategies provide limited classes of OWA operators. There 
also exist other strategies to determine the weights, e.g., the 
weights generation based on entropy maximization. See related 
references [5]-[10] for details. 


¹ Where $X$ is a vector or a matrix, $X^T$ denotes the transpose of $X$.


Based on such OWA operators, for each alternative $A_i$, $i = 1, \ldots, q$, we can choose a weighting vector $W_i = [w_{i1}, w_{i2}, \ldots, w_{in}]$ and compute its OWA value $V_i \triangleq F(C_{i1}, C_{i2}, \ldots, C_{in}) = \sum_j w_{ij} \cdot b_{ij}$, where $b_{ij}$ is the $j$th largest element in the collection of payoffs $C_{i1}, C_{i2}, \ldots, C_{in}$. Then, as for decision-making under ignorance, we choose $A^* = A_{i^*}$ with $i^* \triangleq \arg\max_i \{V_i\}$.
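The OWA operator of Eq. (1) and the four weighting strategies listed above can be sketched in a few lines. This is a minimal illustration; the function names (`owa`, `pessimistic`, `optimistic`, `hurwicz`, `normative`) and the sample payoff row are our own assumptions and do not come from [2] or [14].

```python
def owa(values, weights):
    """OWA value: weights apply to positions after sorting values in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0.0 or w > 1.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    ordered = sorted(values, reverse=True)      # b_i = i-th largest element
    return sum(w * b for w, b in zip(weights, ordered))


def pessimistic(n):
    return [0.0] * (n - 1) + [1.0]              # all weight on the smallest value


def optimistic(n):
    return [1.0] + [0.0] * (n - 1)              # all weight on the largest value


def hurwicz(n, alpha):
    if n == 1:
        return [1.0]
    return [alpha] + [0.0] * (n - 2) + [1.0 - alpha]


def normative(n):
    return [1.0 / n] * n                        # plain arithmetic mean


payoffs = [7, 5, 12, 13, 6]                     # illustrative payoff row
n = len(payoffs)
print(owa(payoffs, pessimistic(n)), owa(payoffs, optimistic(n)))
print(owa(payoffs, hurwicz(n, 0.5)), owa(payoffs, normative(n)))
```

Because the weights are tied to ranks rather than to arguments, permuting `payoffs` leaves every result unchanged.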


B. Uncertainty in MCDM context 


Decisions are often made based on imperfect information 
and knowledge (imprecise, uncertain, incomplete) provided 
by several more or less reliable sources and depend on the 
states of the world: decisions can be taken in certain, risky or 
uncertain environment [15]. In a MCDM context, the decision 
under uncertainty means that the evaluations of the alternative 
are dependent on the state of the world. 

Introducing the ignorance and the uncertainty in a MCDM process consists in considering that the consequences of the alternatives ($A_i$) depend on the state of nature, represented by a finite set $S = \{S_1, S_2, \ldots, S_n\}$. For each state, the MCDM method provides an evaluation $C_{ij}$. We assume that this evaluation $C_{ij}$ done by the decision maker corresponds to the choice of $A_i \in \{A_1, \ldots, A_q\}$ when $S_j$ occurs with a given (possibly subjective) probability. The evaluation matrix is defined as $C = [C_{ij}]$ where $i = 1, \ldots, q$ and $j = 1, \ldots, n$.

Since the payoff to the decision maker depends upon the 
state of nature, his procedure for selecting the best alternative 
depends upon the type of knowledge he has about the state of 
nature. For representing the uncertainty for the state of nature, 
the belief functions introduced in Dempster-Shafer Theory 
(DST) [16] (known also as the Evidence Theory) can be used. 
This is briefly introduced below. 


C. Basics of Evidence Theory 

In DST, the elements in the frame of discernment (FOD), denoted by $\Theta$, are mutually exclusive and exhaustive. Suppose $2^\Theta$ denotes the powerset of the FOD. One defines the function $m : 2^\Theta \to [0, 1]$ as the basic belief assignment (bba, also called mass function) if it satisfies:

$$\sum_{A \subseteq \Theta} m(A) = 1, \quad m(\emptyset) = 0 \quad (2)$$

The belief function (Bel) and the plausibility function (Pl) are defined below, respectively:

$$Bel(A) = \sum_{B \subseteq A} m(B) \quad (3)$$

$$Pl(A) = \sum_{A \cap B \neq \emptyset} m(B) \quad (4)$$

Let us consider two bba's $m_1(.)$ and $m_2(.)$ defined over the FOD $\Theta$. Their corresponding focal elements² are $A_1, \ldots, A_k$ and $B_1, \ldots, B_l$. If $K = \sum_{A_i \cap B_j = \emptyset} m_1(A_i)\, m_2(B_j) < 1$, the function $m : 2^\Theta \to [0, 1]$ denoted by

$$m(A) = \begin{cases} 0, & A = \emptyset \\ \dfrac{\sum_{A_i \cap B_j = A} m_1(A_i)\, m_2(B_j)}{1 - \sum_{A_i \cap B_j = \emptyset} m_1(A_i)\, m_2(B_j)}, & A \neq \emptyset \end{cases} \quad (5)$$

is also a bba. The rule defined in Eq. (5) is called Dempster's rule of combination.

² A focal element $X$ of a bba $m(.)$ is an element of the power set of the FOD such that $m(X) > 0$.
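Equations (3)-(5) translate directly into code once focal elements are represented as sets. The following sketch assumes a bba stored as a dictionary mapping `frozenset` focal elements to masses; the function names and the three-element frame with its masses are illustrative assumptions only.

```python
def dempster(m1, m2):
    """Dempster's rule (Eq. (5)) for two bba's given as {frozenset: mass} dictionaries."""
    raw, conflict = {}, 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            inter = a & b
            if inter:
                raw[inter] = raw.get(inter, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b     # K: mass of empty intersections
    if conflict >= 1.0:
        raise ValueError("total conflict (K = 1): Dempster's rule is undefined")
    return {a: v / (1.0 - conflict) for a, v in raw.items()}


def bel(m, a):
    """Belief (Eq. (3)): total mass of focal elements included in a."""
    return sum(v for b, v in m.items() if b <= a)


def pl(m, a):
    """Plausibility (Eq. (4)): total mass of focal elements intersecting a."""
    return sum(v for b, v in m.items() if b & a)


# Illustrative frame and masses (assumed values, not taken from the paper).
THETA = frozenset({"x", "y", "z"})
X, Y = frozenset({"x"}), frozenset({"y"})
m1 = {X: 0.6, THETA: 0.4}
m2 = {Y: 0.5, THETA: 0.5}

m = dempster(m1, m2)
print(m[X], m[Y], m[THETA])
print(bel(m, X), pl(m, X))
```

Here the conflict is $K = 0.6 \cdot 0.5 = 0.3$, so every surviving product is divided by $1 - K = 0.7$.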


D. MCDM with belief structures 


Yager proposed an approach for decision making with belief structures [14]. One considers a collection of $q$ alternatives belonging to $A = \{A_1, A_2, \ldots, A_q\}$ and a finite set $S = \{S_1, S_2, \ldots, S_n\}$ of states of the nature. We assume that the payoff/gain $C_{ij}$ of the decision maker in choosing $A_i$ when $S_j$ occurs is given by a positive (or null) number. The payoffs $q \times n$ matrix is defined by $C = [C_{ij}]$ where $i = 1, \ldots, q$ and $j = 1, \ldots, n$ as in eq. (2). The decision-making problem consists in choosing the alternative $A^* \in A$ which maximizes the payoff to the decision maker given the knowledge on the state of the nature and the payoffs matrix $C$. $A^* \in A$ is called the best alternative or the solution (if any) of the decision-making problem.

In Yager's approach, the knowledge on the state of the nature is characterized by a belief structure. Clearly, one assumes that a priori knowledge on the frame $S$ of the different states of the nature is given by a bba $m(.) : 2^S \to [0, 1]$. Decision under certainty is characterized by $m(S_j) = 1$; decision under risk is characterized by $m(S_j) > 0$ for some states $S_j \in S$; decision under full ignorance is characterized by $m(S_1 \cup S_2 \cup \ldots \cup S_n) = 1$, etc. Yager's OWA for decision making under uncertainty combines the schemes used for decision making under risk and under ignorance. It is based on the derivation of a generalized expected value $C_i$ of payoff for each alternative $A_i$ as follows:

$$C_i = \sum_{k=1}^{r} m(X_k)\, V_{ik} \quad (6)$$

where $r$ is the number of focal elements of the belief structure, $m(X_k)$ is the mass assignment of the focal element $X_k \in 2^S$, and $V_{ik}$ is the payoff we get when we select alternative $A_i$ and the state of nature lies in $X_k$. The derivation of $V_{ik}$ is done similarly as for decision making under ignorance (i.e., the OWA procedure) when restricting the states of the nature to the subset of states belonging to $X_k$ only. One can choose different strategies to determine the weights. Actually, $C_i$ is essentially the expected value of the payoffs under $A_i$. The alternative with the highest $C_i$ is selected as the optimal one.


E. Cautious OWA with Evidential Reasoning 


Yager's OWA approach is based on the choice of a given attitude, measured by an optimistic index in [0, 1], to get the weighting vector $W$. How should such an index/attitude be chosen? This choice is ad hoc and very disputable for users. In our previous work [15] we considered only the two extreme attitudes (pessimistic and optimistic) jointly and developed a method called Cautious OWA with Evidential Reasoning (COWA-ER) for decision under uncertainty based on the imprecise evaluation of alternatives.

In COWA-ER, the pessimistic and optimistic OWA are used respectively to construct the intervals of expected payoffs for the different alternatives. For example, if there exist $q$ alternatives, the expected payoffs are as follows.

$$E[C] = \begin{bmatrix} E[C_1] \\ E[C_2] \\ \vdots \\ E[C_q] \end{bmatrix} \subset \begin{bmatrix} [C_1^{min}, C_1^{max}] \\ [C_2^{min}, C_2^{max}] \\ \vdots \\ [C_q^{min}, C_q^{max}] \end{bmatrix}$$


Therefore, one has q sources of information about the 
parameter associated with the best alternative to choose. 
For decision making under imprecision, the belief functions 
framework is used again. COWA-ER includes four steps: 


• Step 1: normalization of imprecise values in [0, 1];

• Step 2: conversion of each normalized imprecise value into an elementary bba $m_i(.)$;

• Step 3: fusion of the bba's $m_i(.)$ with some combination rule;

• Step 4: choice of the final decision based on the resulting combined bba.


In step 2, we convert each imprecise value into its bba according to a very natural and simple transformation [17]. Here, we need to consider the finite set of alternatives $\Theta = \{A_1, A_2, \ldots, A_q\}$ as the frame of discernment, and the sources of belief associated with them are obtained from the normalized imprecise expected payoff vector $E^{Imp}[C]$. The modeling for computing a bba associated to $A_i$ from any imprecise value $[a; b] \subseteq [0; 1]$ is simple and is done as follows:

$$\begin{cases} m_i(A_i) = a, \\ m_i(\bar{A}_i) = 1 - b, \\ m_i(A_i \cup \bar{A}_i) = m_i(\Theta) = b - a \end{cases} \quad (7)$$

where $\bar{A}_i$ is the complement of $A_i$ in $\Theta$. With such a conversion, one sees that $Bel(A_i) = a$ and $Pl(A_i) = b$. The uncertainty is represented by the length of the interval $[a; b]$ and corresponds to the imprecision of the variable (here the expected payoff) on which the belief function for $A_i$ is defined.


III. A NOVEL FUZZY-COWA-ER


COWA-ER is rational and can process MCDM under uncertainty well. However, the complexity and the computational time of the combination step of COWA-ER depend heavily on the number of alternatives used for decision-making. When the number of alternatives is large, the computational cost increases significantly. In COWA-ER, each expected interval is used as an information source; however, these expected intervals are jointly obtained, so these information sources are relatively correlated, which is harmful for the subsequent evidence combination. In this paper, we propose a modified COWA-ER called Fuzzy-COWA-ER. Before presenting the principle of FCOWA-ER, we first recall that the pessimistic and optimistic OWA versions are used respectively to construct the intervals of expected payoffs for the different alternatives as follows:

$$E[C] = \begin{bmatrix} E[C_1] \\ E[C_2] \\ \vdots \\ E[C_q] \end{bmatrix} \subset \begin{bmatrix} [C_1^{min}, C_1^{max}] \\ [C_2^{min}, C_2^{max}] \\ \vdots \\ [C_q^{min}, C_q^{max}] \end{bmatrix}$$

A. Principle of FCOWA-ER


In COWA-ER, each row of the expected payoff $E[C]$ is used as an information source, while in FCOWA-ER we consider the two columns of $E[C]$ as two information sources, representing the pessimistic and the optimistic attitude, respectively. The column-wise normalized expected payoff is

$$E^{Fuzzy}[C] = \begin{bmatrix} N_1^{min}, & N_1^{max} \\ N_2^{min}, & N_2^{max} \\ \vdots & \vdots \\ N_q^{min}, & N_q^{max} \end{bmatrix}$$

where $N_i^{min} \in [0, 1]$ ($i = 1, \ldots, q$) represents the normalized value in the column of the pessimistic attitude and $N_i^{max} \in [0, 1]$ represents the normalized value in the column of the optimistic attitude. The vectors $[N_1^{min}, \ldots, N_q^{min}]$ and $[N_1^{max}, \ldots, N_q^{max}]$ can be seen as two fuzzy membership functions (FMFs) representing the possibilities of all the alternatives $A_1, \ldots, A_q$.


The principle of FCOWA-ER includes the following steps:

• Step 1: normalization of each column in $E[C]$, respectively, to obtain $E^{Fuzzy}[C]$;

• Step 2: conversion of the two normalized columns, i.e., the two FMFs, into two bba's $m_{Pess}(.)$ and $m_{Opti}(.)$;

• Step 3: fusion of the bba's $m_{Pess}(.)$ and $m_{Opti}(.)$ with some combination rule;

• Step 4: choice of the final decision based on the resulting combined bba.

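In Step 2, we implement the conversion of the FMF into the bba by using the α-cut approach as follows. Suppose the FOD is $\Theta = \{A_1, A_2, \ldots, A_q\}$ and the FMF is $\mu(A_i)$, $i = 1, \ldots, q$. The corresponding bba introduced in [18] is generated using $M$ α-cuts ($0 < \alpha_1 < \alpha_2 < \cdots < \alpha_M \leq 1$), where $M \leq |\Theta| = n$:

$$\begin{cases} B_j = \{A_i \in \Theta \mid \mu(A_i) \geq \alpha_j\} \\ m(B_j) = \dfrac{\alpha_j - \alpha_{j-1}}{\alpha_M} \end{cases} \quad (8)$$

with the convention $\alpha_0 = 0$. $B_j$, for $j = 1, \ldots, M$ ($M \leq |\Theta|$), represents the focal element. For simplicity, here we set $M = q$ and take $0 < \alpha_1 < \alpha_2 < \cdots < \alpha_q \leq 1$ as the sorted values of $\mu(A_i)$.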
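The α-cut conversion can be sketched as follows. This is an illustrative reading of the construction, assuming $\alpha_0 = 0$ and taking the distinct membership values as the cut levels; the function name `alpha_cut_bba` is our own. The membership values used are those of the pessimistic FMF of the worked example in Section III-B.

```python
def alpha_cut_bba(mu):
    """Convert a fuzzy membership function {alternative: value} into a consonant bba.

    Each distinct positive membership value is used as an alpha level; the focal
    element B_j gathers the alternatives whose membership is at least alpha_j, and
    its mass is (alpha_j - alpha_{j-1}) / alpha_M with alpha_0 = 0.
    """
    levels = sorted(v for v in set(mu.values()) if v > 0.0)
    top = levels[-1]
    bba, previous = {}, 0.0
    for alpha in levels:
        focal = frozenset(a for a, v in mu.items() if v >= alpha)
        bba[focal] = (alpha - previous) / top
        previous = alpha
    return bba


# Pessimistic FMF of the worked example (values taken from Section III-B).
mu_pess = {"A1": 1.0, "A2": 0.6129, "A3": 0.7742, "A4": 0.8387}
m_pess = alpha_cut_bba(mu_pess)
for focal, mass in m_pess.items():
    print(sorted(focal), round(mass, 4))
```

The focal elements produced this way are nested, so the resulting bba is consonant by construction.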


B. Example of FCOWA-ER versus COWA-ER and OWA 


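Example 1: Let's take the states $S = \{S_1, S_2, S_3, S_4, S_5\}$ with the associated bba $m(.)$ given by:

$$\begin{cases} m(S_1 \cup S_3 \cup S_4) = 0.6 \\ m(S_2 \cup S_5) = 0.3 \\ m(S_1 \cup S_2 \cup S_3 \cup S_4 \cup S_5) = 0.1 \end{cases}$$

Let's also consider the alternatives $A = \{A_1, A_2, A_3, A_4\}$ and the payoffs matrix:

$$C = \begin{bmatrix} 7 & 5 & 12 & 13 & 6 \\ 12 & 10 & 5 & 11 & 2 \\ 9 & 13 & 3 & 10 & 9 \\ 6 & 9 & 11 & 15 & 4 \end{bmatrix} \quad (9)$$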


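1) Implementation of OWA: The $r = 3$ focal elements of $m(.)$ are $X_1 = S_1 \cup S_3 \cup S_4$, $X_2 = S_2 \cup S_5$ and $X_3 = S_1 \cup S_2 \cup S_3 \cup S_4 \cup S_5$. $X_1$ and $X_2$ are partial ignorances and $X_3$ is the full ignorance. One considers the following submatrices (called bags by Yager) for the derivation of $V_{ik}$, for $i = 1, 2, 3, 4$ and $k = 1, 2, 3$.

$$M(X_1) = \begin{bmatrix} M_{11} \\ M_{21} \\ M_{31} \\ M_{41} \end{bmatrix} = \begin{bmatrix} 7 & 12 & 13 \\ 12 & 5 & 11 \\ 9 & 3 & 10 \\ 6 & 11 & 15 \end{bmatrix}$$

$$M(X_2) = \begin{bmatrix} M_{12} \\ M_{22} \\ M_{32} \\ M_{42} \end{bmatrix} = \begin{bmatrix} 5 & 6 \\ 10 & 2 \\ 13 & 9 \\ 9 & 4 \end{bmatrix}$$

$$M(X_3) = \begin{bmatrix} M_{13} \\ M_{23} \\ M_{33} \\ M_{43} \end{bmatrix} = \begin{bmatrix} 7 & 5 & 12 & 13 & 6 \\ 12 & 10 & 5 & 11 & 2 \\ 9 & 13 & 3 & 10 & 9 \\ 6 & 9 & 11 & 15 & 4 \end{bmatrix} = C$$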

• Using the pessimistic attitude, and applying the OWA operator on each row of $M(X_k)$ for $k = 1$ to $r$, one finally gets: $V(X_1) = [V_{11}, V_{21}, V_{31}, V_{41}]^T = [7, 5, 3, 6]^T$, $V(X_2) = [V_{12}, V_{22}, V_{32}, V_{42}]^T = [5, 2, 9, 4]^T$ and $V(X_3) = [V_{13}, V_{23}, V_{33}, V_{43}]^T = [5, 2, 3, 4]^T$. Applying formula (6) for $i = 1, 2, 3, 4$, one finally gets the following generalized expected values using vectorial notation:

$$[C_1, C_2, C_3, C_4]^T = \sum_{k=1}^{r} m(X_k) \cdot V(X_k) = [6.2, 3.8, 4.8, 5.2]^T$$

According to these values, the best alternative to take is $A_1$ since it has the highest generalized expected payoff.


• Using the optimistic attitude, one takes the max value of each row, and applying OWA on each row of $M(X_k)$ for $k = 1$ to $r$, one gets: $V(X_1) = [V_{11}, V_{21}, V_{31}, V_{41}]^T = [13, 12, 10, 15]^T$, $V(X_2) = [V_{12}, V_{22}, V_{32}, V_{42}]^T = [6, 10, 13, 9]^T$, and $V(X_3) = [V_{13}, V_{23}, V_{33}, V_{43}]^T = [13, 12, 13, 15]^T$. One finally gets $[C_1, C_2, C_3, C_4]^T = [10.9, 11.4, 11.2, 13.2]^T$, and the best alternative to take with the optimistic attitude is $A_4$ since it has the highest generalized expected payoff. We then have the expected payoffs

$$E[C] = \begin{bmatrix} E[C_1] \\ E[C_2] \\ E[C_3] \\ E[C_4] \end{bmatrix} \subset \begin{bmatrix} [6.2; 10.9] \\ [3.8; 11.4] \\ [4.8; 11.2] \\ [5.2; 13.2] \end{bmatrix}$$
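The two columns of expected payoffs above follow from Eq. (6) with `min` (pessimistic) and `max` (optimistic) as the OWA aggregation on each bag. The sketch below re-derives them from the payoff matrix (9) and the bba of Example 1; the data layout (state indices 0-4 as tuple keys) and the function name are our own assumptions.

```python
# Payoff matrix (9): one row per alternative, one column per state S1..S5.
C = {
    "A1": [7, 5, 12, 13, 6],
    "A2": [12, 10, 5, 11, 2],
    "A3": [9, 13, 3, 10, 9],
    "A4": [6, 9, 11, 15, 4],
}

# bba on the states, with focal elements given as tuples of 0-based state indices.
m = {
    (0, 2, 3): 0.6,        # S1 u S3 u S4
    (1, 4): 0.3,           # S2 u S5
    (0, 1, 2, 3, 4): 0.1,  # full ignorance
}


def generalized_expected_payoffs(payoffs, bba, aggregate):
    """Eq. (6): C_i = sum_k m(X_k) * V_ik, with V_ik = aggregate(bag of A_i on X_k)."""
    return {
        alt: sum(mass * aggregate([row[j] for j in focal]) for focal, mass in bba.items())
        for alt, row in payoffs.items()
    }


pessimistic = generalized_expected_payoffs(C, m, min)
optimistic = generalized_expected_payoffs(C, m, max)
for alt in C:
    print(alt, round(pessimistic[alt], 1), round(optimistic[alt], 1))
```

Passing a different aggregation function (for instance a Hurwicz-type mix of `min` and `max`) would give intermediate attitudes without changing the rest of the computation.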


2) Implementation of COWA-ER: Let's describe in detail each step of COWA-ER. In step 1, we divide each bound of the intervals by the max of the bounds to get a new normalized imprecise expected payoff vector $E^{Imp}[C]$. In our example, one gets:

$$E^{Imp}[C] = \begin{bmatrix} [6.2/13.2; 10.9/13.2] \\ [3.8/13.2; 11.4/13.2] \\ [4.8/13.2; 11.2/13.2] \\ [5.2/13.2; 13.2/13.2] \end{bmatrix} \approx \begin{bmatrix} [0.47; 0.82] \\ [0.29; 0.86] \\ [0.36; 0.85] \\ [0.39; 1.00] \end{bmatrix}$$
In step 2, we convert each imprecise value into its bba according to a very natural and simple transformation [17]. Here, we need to consider the finite set of alternatives $\Theta = \{A_1, A_2, A_3, A_4\}$ as the FOD. The sources of belief associated with them are obtained from the normalized imprecise expected payoff vector $E^{Imp}[C]$. The modeling for computing a bba associated to the hypothesis $A_i$ from any imprecise value $[a; b] \subseteq [0; 1]$ is very simple and is done as in (7), where $\bar{A}_i$ is the complement of $A_i$ in $\Theta$. With such a simple conversion, one sees that $Bel(A_i) = a$ and $Pl(A_i) = b$. The uncertainty is represented by the length of the interval $[a; b]$ and corresponds to the imprecision of the variable (here the expected payoff) on which the belief function for $A_i$ is defined. In the example, one gets:


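TABLE I
BBA'S OF THE ALTERNATIVES USED IN COWA-ER.

| Alternatives $A_i$ | $m_i(A_i)$ | $m_i(\bar{A}_i)$ | $m_i(A_i \cup \bar{A}_i)$ |
|---|---|---|---|
| $A_1$ | 0.47 | 0.18 | 0.35 |
| $A_2$ | 0.29 | 0.14 | 0.57 |
| $A_3$ | 0.36 | 0.15 | 0.49 |
| $A_4$ | 0.39 | 0 | 0.61 |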


In step 3, we use Dempster's rule of combination to obtain³ the combined bba, which is listed in Table II.


TABLE II
FUSION OF 4 BBA'S WITH DEMPSTER'S RULE FOR COWA-ER.

| Focal Element | $m_{Dempster}(.)$ |
|---|---|
| $A_1$ | 0.2522 |
| $A_2$ | 0.1151 |
| $A_3$ | 0.1627 |
| $A_4$ | 0.1894 |
| $A_1 \cup A_4$ | 0.0087 |
| $A_2 \cup A_4$ | 0.0180 |
| $A_3 \cup A_4$ | 0.0137 |
| $A_1 \cup A_2 \cup A_4$ | 0.0368 |
| $A_1 \cup A_3 \cup A_4$ | 0.0279 |
| $A_2 \cup A_3 \cup A_4$ | 0.0576 |
| $A_1 \cup A_2 \cup A_3 \cup A_4$ | 0.1179 |


In step 4, we use the Pignistic Transformation to obtain the pignistic probability corresponding to the combined bba, listed in Table III. More efficient (but more complex) transformations, like DSmP, could be used instead [19]. Based on the pignistic probability obtained, the decision result is $A_1$.


TABLE III
PIGNISTIC PROBABILITY BASED ON COWA-ER.

| Focal Element | BetP(.) |
|---|---|
| $A_1$ | 0.3076 |
| $A_2$ | 0.1851 |
| $A_3$ | 0.2275 |
| $A_4$ | 0.2798 |

3) Implementation of FCOWA-ER: In step 1 of FCOWA-ER, we normalize each column in $E[C]$, respectively. In our example, one gets:

$$E^{Fuzzy}[C] = \begin{bmatrix} 6.2/6.2; & 10.9/13.2 \\ 3.8/6.2; & 11.4/13.2 \\ 4.8/6.2; & 11.2/13.2 \\ 5.2/6.2; & 13.2/13.2 \end{bmatrix} \approx \begin{bmatrix} 1.0000; & 0.8258 \\ 0.6129; & 0.8636 \\ 0.7742; & 0.8485 \\ 0.8387; & 1.0000 \end{bmatrix}$$


³ Other combination rules can be used instead to circumvent the limitations of Dempster's rule discussed in [19], [20].


Then we obtain two FMFs, which are

$\mu_1 = [1, 0.6129, 0.7742, 0.8387]$;

$\mu_2 = [0.8258, 0.8636, 0.8485, 1.0000]$.

In step 2, by using the α-cut approach, $\mu_1$ and $\mu_2$ are converted into two bba's $m_{Pess}(.)$ and $m_{Opti}(.)$ as listed in Table IV. In step 3, we use Dempster's rule⁴ to combine $m_{Pess}(.)$ and $m_{Opti}(.)$ to get $m_{Dempster}(.)$ as listed in Table V.


TABLE IV
THE TWO BBA'S TO COMBINE OBTAINED FROM FMFS.

| Focal Element | $m_{Pess}(.)$ | Focal Element | $m_{Opti}(.)$ |
|---|---|---|---|
| $A_1 \cup A_2 \cup A_3 \cup A_4$ | 0.6129 | $A_1 \cup A_2 \cup A_3 \cup A_4$ | 0.8257 |
| $A_1 \cup A_3 \cup A_4$ | 0.1613 | $A_2 \cup A_3 \cup A_4$ | 0.0227 |
| $A_1 \cup A_4$ | 0.0645 | $A_2 \cup A_4$ | 0.0152 |
| $A_1$ | 0.1613 | $A_4$ | 0.1364 |


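TABLE V
FUSION OF TWO BBA'S WITH DEMPSTER'S RULE FOR FCOWA-ER.

| Focal Element | $m_{Dempster}(.)$ |
|---|---|
| $A_1$ | 0.1370 |
| $A_4$ | 0.1227 |
| $A_1 \cup A_4$ | 0.0549 |
| $A_2 \cup A_4$ | 0.0096 |
| $A_3 \cup A_4$ | 0.0038 |
| $A_1 \cup A_3 \cup A_4$ | 0.1370 |
| $A_2 \cup A_3 \cup A_4$ | 0.0143 |
| $A_1 \cup A_2 \cup A_3 \cup A_4$ | 0.5207 |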


In step 4, we again use the Pignistic Transformation to get the pignistic probabilities listed in Table VI. Based on these probabilities, the decision result is also $A_1$. The decision results of COWA-ER and FCOWA-ER are the same.


TABLE VI
PIGNISTIC PROBABILITY BASED ON FCOWA-ER.

| Focal Element | BetP(.) |
|---|---|
| $A_1$ | 0.3403 |
| $A_2$ | 0.1397 |
| $A_3$ | 0.1826 |
| $A_4$ | 0.3374 |
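The whole FCOWA-ER chain of this example (column-wise normalization, α-cut conversion, a single Dempster combination, pignistic transformation) fits in one short script. As before, this is only a sketch: the function names are assumptions, unrounded ratios are used, and $\alpha_0 = 0$ is assumed in the α-cut step.

```python
ALTS = ("A1", "A2", "A3", "A4")
pess = {"A1": 6.2, "A2": 3.8, "A3": 4.8, "A4": 5.2}       # pessimistic column of E[C]
opti = {"A1": 10.9, "A2": 11.4, "A3": 11.2, "A4": 13.2}   # optimistic column of E[C]


def normalize(column):
    """Step 1: divide a column by its maximum to obtain a membership function."""
    top = max(column.values())
    return {alt: v / top for alt, v in column.items()}


def alpha_cut_bba(mu):
    """Step 2: nested focal elements from the sorted membership levels (alpha_0 = 0)."""
    levels = sorted(v for v in set(mu.values()) if v > 0.0)
    bba, previous = {}, 0.0
    for alpha in levels:
        focal = frozenset(a for a, v in mu.items() if v >= alpha)
        bba[focal] = (alpha - previous) / levels[-1]
        previous = alpha
    return bba


def dempster(m1, m2):
    """Step 3: Dempster's rule for bba's stored as {frozenset: mass}."""
    raw, conflict = {}, 0.0
    for x, mx in m1.items():
        for y, my in m2.items():
            inter = x & y
            if inter:
                raw[inter] = raw.get(inter, 0.0) + mx * my
            else:
                conflict += mx * my
    return {k: v / (1.0 - conflict) for k, v in raw.items()}


def betp(m):
    """Step 4: pignistic probability of each alternative."""
    p = {alt: 0.0 for alt in ALTS}
    for focal, mass in m.items():
        for alt in focal:
            p[alt] += mass / len(focal)
    return p


m_pess = alpha_cut_bba(normalize(pess))
m_opti = alpha_cut_bba(normalize(opti))
fused = dempster(m_pess, m_opti)          # one combination, whatever the value of q
p = betp(fused)
best = max(p, key=p.get)
print({alt: round(v, 4) for alt, v in p.items()}, best)
```

Only one call to `dempster` is needed however many alternatives there are; what grows with $q$ is the number of nested focal elements in each of the two bba's.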


IV. ANALYSES ON FCOWA-ER 


A. On computational complexity 


In FCOWA-ER, only two bba's are involved in the combination, so only one combination step is needed, whereas in the original COWA-ER, if there are $q$ alternatives, $q - 1$ evidence combination operations are required. Furthermore, the bba's obtained in FCOWA-ER by using the $\alpha$-cut are consonant (their focal elements are nested), which lowers the computational cost compared with the bba's generated in the original COWA-ER. In summary, the newly proposed FCOWA-ER has lower computational complexity. 


⁴ In fact, and more generally, the choice of a rule of combination is entirely left to the preference of the user in our FCOWA-ER methodology. 

B. On physical meaning 


In this new FCOWA-ER approach, the two different information sources are the pessimistic OWA and the optimistic OWA. The combination result can be regarded as a tradeoff between these two attitudes, so its physical or practical meaning is relatively clear. Furthermore, if the decision-maker has a preference for the pessimistic or the optimistic attitude, one can use discounting in the evidence combination to satisfy this preference. We can set the preference for the pessimistic attitude as $\lambda_P$ and the preference for the optimistic attitude as $\lambda_O$. Then the discounting factor can be obtained as 

$$\beta = \begin{cases} \lambda_O / \lambda_P, & \lambda_O \le \lambda_P \\ \lambda_P / \lambda_O, & \lambda_P < \lambda_O \end{cases} \qquad (10)$$ 

Then according to the discounting method [16], one will take: 

$$\begin{cases} m_\beta(X) = \beta \cdot m(X), & \text{for } X \ne \Theta \\ m_\beta(\Theta) = \beta \cdot m(\Theta) + (1 - \beta) \end{cases} \qquad (11)$$ 

If $\lambda_O < \lambda_P$, then $m(.)$ in (11) should be $m_{Opti}(.)$; if $\lambda_P < \lambda_O$, then $m(.)$ in (11) should be $m_{Pess}(.)$. By using the discounting and choosing a combination rule, the decision maker has flexibility in the decision-making process. 
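
The discounting of (10) and (11) can be sketched in a few lines of Python. This is an illustrative sketch only: the preference values `lam_o` and `lam_p` are hypothetical, the function names are not from the paper, and the bba used is $m_{Opti}(.)$ from Table IV.

```python
def discounting_factor(lam_o, lam_p):
    """Eq. (10): ratio of the weaker preference to the stronger one."""
    return lam_o / lam_p if lam_o <= lam_p else lam_p / lam_o

def discount(bba, beta, frame):
    """Eq. (11): scale every mass by beta and transfer (1 - beta) to the frame."""
    out = {focal: beta * mass for focal, mass in bba.items() if focal != frame}
    out[frame] = beta * bba.get(frame, 0.0) + (1.0 - beta)
    return out

frame = frozenset({"A1", "A2", "A3", "A4"})
m_opti = {
    frame: 0.8257,
    frozenset({"A2", "A3", "A4"}): 0.0227,
    frozenset({"A2", "A4"}): 0.0152,
    frozenset({"A4"}): 0.1364,
}

# Hypothetical preferences: the pessimistic attitude is trusted twice as much,
# so lam_o < lam_p and the optimistic bba is the one to discount.
beta = discounting_factor(lam_o=0.5, lam_p=1.0)
m_opti_discounted = discount(m_opti, beta, frame)
```

The discounted bba still sums to one; the mass removed from the specific focal elements is moved to the whole frame, which weakens the influence of the less preferred attitude in the subsequent combination.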


C. On management of uncertainty 


In FCOWA-ER, we first define the bba's vertically (column by column), taking into account the uncertainty between alternatives for the pessimistic attitude and for the optimistic attitude. Then we combine the two columns. The uncertainty is incorporated in the FMF obtained, which represents the possibility of each alternative to be chosen as the final decision result. Based on the $\alpha$-cut approach, the FMF is transformed into a bba, and the uncertainty is thus transferred to the bba. Although each column only carries the information of one attitude (pessimistic or optimistic), the subsequent combination uses both information sources. Thus the available information is fully used in FCOWA-ER. In COWA-ER, the modeling of each row (interval) takes into account the true uncertainty one has on the bounds of the payoff of each alternative; then, after modeling each bba $m_i(.)$, one combines them "vertically" to take into account the uncertainty between alternatives. 

Although COWA-ER and FCOWA-ER manage the incorporated uncertainty in different ways, they both utilize (differently) the whole available information. 


D. On robustness to error scoring 


Based on many experiments, we have observed that almost all the decision results given by FCOWA-ER agree with COWA-ER results⁵ and are rational. However, when the differences among the values in the payoff matrix are significant, COWA-ER can yield wrong decisions whereas FCOWA-ER yields rational decisions, as illustrated in Example 2 below. 

⁵ When using the same rule of combination in step 3, and the same probabilistic transformation in step 4. 


Example 2: Let's take states $S = \{S_1, S_2, S_3, S_4, S_5\}$ with associated bba $m(S_1 \cup S_2 \cup S_3 \cup S_4 \cup S_5) = 1$, and consider alternatives $A = \{A_1, A_2, A_3, A_4\}$ and the payoff matrix: 

$$C = \begin{bmatrix} 12 & 11 & 10 & 120 & 7 \\ 9 & 10 & 6 & 110 & 3 \\ 7 & 13 & 5 & 100 & 6 \\ 6 & 2 & 3 & 150 & 4 \end{bmatrix} \qquad (12)$$ 

We see that the difference between the max value and the min value of each row is significant. For example, in the fourth row of $C$, only $S_4$ brings an extremely high score for $A_4$ whereas the other states bring homogeneous low scores for $A_4$. Whatever state of nature we consider ($S_1$, $S_2$, ..., or $S_5$), $A_1$ is either the top 1 or top 2 choice according to the ranks of the alternatives for states $S_i$, $i = 1, \ldots, 5$ as shown below: 

State | $A_1$ | $A_2$ | $A_3$ | $A_4$ 
$S_1$ | 1 | 2 | 3 | 4 
$S_2$ | 2 | 3 | 1 | 4 
$S_3$ | 1 | 2 | 3 | 4 
$S_4$ | 2 | 3 | 4 | 1 
$S_5$ | 1 | 4 | 2 | 3 

So, intuitively, according to the principle of majority voting, the decision result should be $A_1$ and not $A_4$. According to rank-level fusion, the decision result should also be $A_1$. 

The expected payoffs are: 

$$E[C] = \begin{bmatrix} [7, 120] \\ [3, 110] \\ [5, 100] \\ [2, 150] \end{bmatrix}$$ 
• Using COWA-ER, one has 

$$E^{Imp}[C] = \begin{bmatrix} [0.0467, 0.8000] \\ [0.0200, 0.7333] \\ [0.0333, 0.6667] \\ [0.0133, 1.0000] \end{bmatrix}$$ 

The bba's to combine are listed in Table VII and the combination results by using Dempster's rule are in Table VIII. 

TABLE VII 
EXAMPLE 2: BBA'S OF THE ALTERNATIVES USED IN COWA-ER. 

Alternatives $A_i$ | $m_i(A_i)$ | $m_i(\bar{A}_i)$ | $m_i(A_i \cup \bar{A}_i)$ 
$A_1$ | 0.0467 | 0.2000 | 0.7533 
$A_2$ | 0.0200 | 0.2667 | 0.7133 
$A_3$ | 0.0333 | 0.3333 | 0.6334 
$A_4$ | 0.0133 | 0 | 0.9867 

The pignistic probabilities listed in Table IX indicate that the decision result⁶ of COWA-ER is $A_4$. 


• Using FCOWA-ER, one has 

$$E^{Fuzzy}[C] = \begin{bmatrix} [1.0000, 0.8000] \\ [0.4286, 0.7333] \\ [0.7143, 0.6667] \\ [0.2857, 1.0000] \end{bmatrix}$$ 

In FCOWA-ER, the bba's to combine are listed in Table X and their Dempster's combination is listed in Table XI. 
TABLE VIII 
EXAMPLE 2: DEMPSTER'S FUSION OF 4 BBA'S FOR COWA-ER. 

Focal Element | $m_{Dempster}(.)$ 
$A_1$ | 0.0438 
$A_2$ | 0.0182 
$A_3$ | 0.0309 
$A_4$ | 0.0297 
$A_1 \cup A_4$ | 0.0664 
$A_2 \cup A_4$ | 0.0471 
$A_3 \cup A_4$ | 0.0335 
$A_1 \cup A_2 \cup A_4$ | 0.1775 
$A_1 \cup A_3 \cup A_4$ | 0.1261 
$A_2 \cup A_3 \cup A_4$ | 0.0895 
$A_1 \cup A_2 \cup A_3 \cup A_4$ | 0.3373 


TABLE IX 
EXAMPLE 2: PIGNISTIC PROBABILITY BASED ON COWA-ER. 

Focal Element | BetP(.) 
$A_1$ | 0.2625 
$A_2$ | 0.2152 
$A_3$ | 0.2038 
$A_4$ | 0.3185 

TABLE X 
EXAMPLE 2: BBA'S OF THE ALTERNATIVES USED IN FCOWA-ER. 

Focal Element | $m_{Pess}(.)$ | Focal Element | $m_{Opti}(.)$ 
$A_1 \cup A_2 \cup A_3 \cup A_4$ | 0.2857 | $A_1 \cup A_2 \cup A_3 \cup A_4$ | 0.6667 
$A_1 \cup A_2 \cup A_3$ | 0.1429 | $A_1 \cup A_2 \cup A_4$ | 0.0667 
$A_1 \cup A_3$ | 0.2857 | $A_1 \cup A_4$ | 0.0667 
$A_1$ | 0.2857 | $A_4$ | 0.1999 

TABLE XI 
EXAMPLE 2: DEMPSTER'S FUSION OF THE TWO BBA'S FOR FCOWA-ER. 

Focal Element | $m_{Dempster}(.)$ 
$A_1$ | 0.3223 
$A_4$ | 0.0667 
$A_1 \cup A_2$ | 0.0111 
$A_1 \cup A_3$ | 0.2222 
$A_1 \cup A_4$ | 0.0222 
$A_1 \cup A_2 \cup A_3$ | 0.1111 
$A_1 \cup A_2 \cup A_4$ | 0.0222 
$A_1 \cup A_2 \cup A_3 \cup A_4$ | 0.2222 





The pignistic transformation of $m_{Dempster}(.)$ yields the pignistic probabilities listed in Table XII. 


TABLE XII 
EXAMPLE 2: PIGNISTIC PROBABILITY BASED ON FCOWA-ER. 

Focal Element | BetP(.) 
$A_1$ | 0.5500 
$A_2$ | 0.1056 
$A_3$ | 0.2037 
$A_4$ | 0.1407 


Based on the pignistic probabilities, the decision result obtained with FCOWA-ER is now $A_1$, which is the correct one. In this example, FCOWA-ER shows its robustness when compared with COWA-ER. 


E. On the normalization procedures 


In fact, there exist at least three normalization procedures that we briefly recall below. Suppose $x$ is the original input vector, $x_i$ represents the $i$th dimension of $x$, and $y_i$ represents the $i$th dimension of the normalized vector $y$. We examine the following three types of normalization: 

1) Type I: $y_i = x_i / \max(x)$ 

2) Type II: $y_i = (x_i - \min(x)) / (\max(x) - \min(x))$ 

3) Type III: $y_i = x_i / \sum_j x_j$ 

To verify whether the decision results of COWA-ER and the new FCOWA-ER can be affected by the normalization procedure, we did some tests as follows. We randomly generate payoff matrices and use all three types of normalization in COWA-ER and FCOWA-ER respectively. Then we compare the results obtained. We repeat the experiment 50 times (50 Monte-Carlo runs). Based on our simulation results, we find that the normalization approach can affect the decision results of COWA-ER and FCOWA-ER, although the rate of disagreement among the different normalization approaches is small (about 1 to 2 disagreements out of 50 experiments on average). Example 3 is a case where the disagreement occurs due to the different types of normalization. 

⁶ Based on max of BetP(.). 
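
The three normalization types can be written as short Python functions. This is a minimal sketch with illustrative function names; the test vector is the pessimistic column (6.2, 3.8, 4.8, 5.2) of the first worked example, for which Type I gives the first FMF listed there.

```python
def normalize_type1(x):
    """Type I: divide by the maximum."""
    m = max(x)
    return [v / m for v in x]

def normalize_type2(x):
    """Type II: min-max scaling to [0, 1] (assumes max(x) > min(x))."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo) for v in x]

def normalize_type3(x):
    """Type III: divide by the sum of the components."""
    s = sum(x)
    return [v / s for v in x]

pessimistic_column = [6.2, 3.8, 4.8, 5.2]
y1 = normalize_type1(pessimistic_column)
y2 = normalize_type2(pessimistic_column)
y3 = normalize_type3(pessimistic_column)
```

Type I and Type II always map the largest component to 1, while Type II also maps the smallest component to 0, which removes any support for the worst alternative. Type III preserves ratios but generally leaves the largest component below 1.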


Example 3: We consider the following payoff matrix 

$$C = \begin{bmatrix} 15 & 5 & 30 \\ 5 & 40 & 40 \\ 40 & 30 & 30 \\ 15 & 10 & 40 \end{bmatrix}$$ 

The bba is $m(X_1) = 0.5439$, $m(X_2) = 0.3711$, $m(X_3) = 0.0849$, where $X_1 = S_2 \cup S_3$, $X_2 = S_1 \cup S_2$ and $X_3 = S_1 \cup S_3$. 

• Using COWA-ER, based on normalization Type I, Type II and Type III, we can obtain the corresponding expected payoffs 

$$E_{I}[C] = \begin{bmatrix} [0.1462, 0.6108] \\ [0.6009, 1.0000] \\ [0.7500, 0.8640] \\ [0.2606, 0.7680] \end{bmatrix}$$ 

$$E_{II}[C] = \begin{bmatrix} [0.0000, 0.5442] \\ [0.5326, 1.0000] \\ [0.7072, 0.8407] \\ [0.1340, 0.7283] \end{bmatrix}$$ 

$$E_{III}[C] = \begin{bmatrix} [0.0292, 0.1221] \\ [0.1202, 0.2000] \\ [0.1500, 0.1728] \\ [0.0521, 0.1536] \end{bmatrix}$$ 

Then we obtain the pignistic probabilities listed in Table XIII. From Table XIII, one sees that the decision result with Type III normalization is $A_2$ while those of Type I and Type II yield $A_3$. 


TABLE XIII 
EXAMPLE 3: PIGNISTIC PROB. FOR TYPES I, II & III AND COWA-ER. 

Focal Element | $BetP_{I}(.)$ | $BetP_{II}(.)$ | $BetP_{III}(.)$ 
$A_1$ | 0.0587 | 0.0388 | 0.1690 
$A_2$ | 0.3203 | 0.3324 | 0.3180 
$A_3$ | 0.5223 | 0.5444 | 0.2920 
$A_4$ | 0.0987 | 0.0844 | 0.2210 

• Using FCOWA-ER, based on normalization of Type I, Type II and Type III, we get the corresponding expected payoffs 

$$E_{I}[C] = \begin{bmatrix} [0.1950, 0.6108] \\ [0.8013, 1.0000] \\ [1.0000, 0.8640] \\ [0.3475, 0.7680] \end{bmatrix}$$ 

$$E_{II}[C] = \begin{bmatrix} [0.0000, 0.0000] \\ [0.7531, 1.0000] \\ [1.0000, 0.6506] \\ [0.1894, 0.4040] \end{bmatrix}$$ 

$$E_{III}[C] = \begin{bmatrix} [0.0832, 0.1884] \\ [0.3419, 0.3084] \\ [0.4267, 0.2664] \\ [0.1483, 0.2368] \end{bmatrix}$$ 

Then we can get the pignistic probabilities listed in Table XIV. From Table XIV, one sees that the decision with Type II normalization is $A_2$ while those of Type I and Type III yield $A_3$. 


TABLE XIV 
EXAMPLE 3: PIGNISTIC PROBABILITY BASED ON FCOWA-ER 

Focal Element | $BetP_{I}(.)$ | $BetP_{II}(.)$ | $BetP_{III}(.)$ 
$A_1$ | 0.0306 | 0.0000 | 0.0306 
$A_2$ | 0.4118 | 0.5421 | 0.4118 
$A_3$ | 0.4763 | 0.4300 | 0.4763 
$A_4$ | 0.0813 | 0.0279 | 0.0813 


So in a small percentage of cases the choice of the normalization procedure changes the decision, and we must be cautious when choosing it. So far there is no clear theoretical answer for the choice of the most suitable normalization procedure. We prefer the Type I normalization procedure since it is very simple and intuitively appealing. 


V. CONCLUSION 


In this paper, we have proposed a fuzzy cautious OWA method using evidential reasoning (FCOWA-ER) to implement multi-criteria decision making, where evidence theory, fuzzy membership functions and OWA are used jointly. Compared with COWA-ER, this method has lower computational complexity and a clearer physical meaning. Furthermore, it is more robust to scoring errors in MCDM. Experimental results and related analyses show that our FCOWA-ER is interesting and flexible because its three main specifications can be adapted easily for working: 1) with other rules of combination than Dempster's rule, 2) with other probabilistic transformations than BetP, and 3) with different normalization procedures. Of course the performance of FCOWA-ER depends on the choice of these three main specifications made by the MCDM system designer. The method to generate the bba from the FMF based on the $\alpha$-cut also depends on the selection of the parameter vector of $\alpha$. The impact of the choice of the specifications, as well as of $\alpha$, on the performance of FCOWA-ER will be further analyzed in our future works. This paper was devoted to the theoretical development of FCOWA-ER; its evaluation for applications to real MCDM problems is part of our future research works. 
Soft ELECTRE TRI Outranking 
Method Based on Belief Functions 


Jean Dezert 
Jean-Marc Tacnet 


Abstract: The main decision problems can be described as the choice, ranking or sorting of a set of alternatives. The classical ELECTRE TRI (ET) method is a multicriteria-based outranking sorting method which assigns alternatives to a set of predetermined categories. The ET method deals with situations where indifference is not transitive and solutions can sometimes appear incomparable. ET suffers from two main drawbacks: 1) it requires an arbitrary choice of $\lambda$-cut step to perform the outranking of alternatives versus profiles of categories, and 2) an arbitrary choice of attitude for the final assignment of alternatives into the categories. ET finally gives a binary (hard) assignment of alternatives into categories. In this paper we develop a soft version of the ET method based on belief functions which circumvents the aforementioned drawbacks of ET and provides both a soft (probabilistic) assignment of alternatives into categories and an indicator of the consistency of the soft solution. This Soft-ET approach is applied on a concrete example to show how it works and to compare it with the classical ET method. 


Keywords: ELECTRE TRI, information fusion, belief 
functions, outranking methods, multicriteria analysis. 


I. INTRODUCTION 


Multi-criteria decision analysis aims to choose, sort or rank alternatives or solutions according to criteria involved in the decision-making process. The main steps of a multi-criteria analysis consist in identifying decision purposes, defining criteria, eliciting preferences between criteria, evaluating alternatives or solutions and analyzing sensitivity with regard to weights, thresholds, etc. A distinction has to be made between total and partial aggregation methods: 

• Total aggregation methods such as the Multi-Attribute Utility Theory (M.A.U.T.) [1], [2] synthesize in a unique value the partial utilities related to each criterion and chosen by the decision-maker (DM). Each partial utility function transforms any quantitative evaluation of a criterion into a utility value. The additive method is the simplest method to aggregate those utilities. 

• Partial aggregation methods are not based on the principle of preference transitivity. The ELECTRE TRI (ET) outranking method inspired by Roy [3] and finalized by Yu in [4] belongs to this family and it is the support of the research work presented in this paper. 


ELECTRE TRI (electre tree) is an evolution of the ELECTRE methods introduced in the 1960's by Roy [5], which remain widespread methods used in operational research. The acronym ELECTRE stands for "ELimination Et Choix Traduisant la REalité" (Elimination and Choice Expressing the Reality). ET is simpler and more general than the previous ELECTRE methods, which have specificities given their context of applications. A good introduction to ET methods with substantial references and a detailed historical survey can be found in [6], and additional references in [7]. This paper proposes a methodology inspired by the ET method, able to help decision-making based on imperfect information, for the soft assignment of alternatives into a given set of categories defined by predetermined profiles. Our method, called "Soft ELECTRE TRI" (or just SET for short), is based on belief functions. It circumvents the problem of the arbitrary choice of the $\lambda$-cut in the outranking step of ET, and the ad hoc choice of attitude in the final assignment step of ET as well. Contrary to ET, which solves the hard assignment problem, SET proposes a new solution for solving the assignment problem in a soft manner. This paper is organized as follows. In Section II, we recall the principles of the ET method with its main steps. In Section III, we present in detail our new SET method with emphasis on its differences with classical ET. In Section IV, we apply ET and SET on a concrete example proposed by Maystre [8] to show how they work and to make a comparison between the two approaches. Section V concludes this paper and proposes some perspectives of this work. 

Originally published as Dezert J., Tacnet J.-M., Soft ELECTRE TRI outranking method based on belief functions, Proc. of Fusion 2012, Singapore, July 2012, and reprinted with permission. 


II. THE ELECTRE TRI (ET) METHOD 


Outranking methods like the ET method presented in this section are relevant for Multi-Criteria Decision Analysis (MCDA) [6] when: 

• alternatives are evaluated on an ordinal scale; 

• criteria are strongly heterogeneous by nature (e.g. comfort, price, pollution); 

• compensation of the loss on one criterion by a gain on another is unacceptable; 

• small differences of evaluations are not individually significant while the accumulation of several of these differences may become significant. 

We are concerned with an assignment problem in complex situations where several given alternatives have to be assigned to known categories based on multiple criteria. The categories are defined by profile values (bounds) for each criterion involved in the problem under consideration, as depicted in Fig. 1 below. 



















[Figure: for each criterion $g_j(.)$, the evaluations $g_j(a_i)$ of alternative $a_i$ are plotted against the evaluations $g_j(b_h)$ of the profiles $b_h$ that delimit the categories $C_h$.] 

Figure 1: ET aims to assign a category to alternatives. 


The ET method is a multicriteria-based outranking sorting method proposing a hard assignment of alternatives $a_i$ in categories $C_h$. More precisely, the alternatives $a_i \in A$, $i = 1, \ldots, n_a$ are committed to ordered categories $C_h \in C$, $h = 1, \ldots, n_h$ according to criteria $c_j$, $j \in J = \{1, \ldots, n_g\}$. Each category $C_h$ is delimited by the set of its lower and upper limits $b_{h-1}$ and $b_h$ with respect to their evaluations $g_j(b_{h-1})$ and $g_j(b_h)$ for each criterion $c_j$ ($g_j(.)$ represents the evaluations of alternatives and profiles for a given criterion $c_j$). By convention, $b_0 < b_1 < \ldots < b_{n_h}$; $b_0$ is the lower (minimal) profile bound and $b_{n_h}$ is the upper (maximal) profile bound. The overall profile $b_h$ is defined through the set of values $\{g_1(b_h), g_2(b_h), \ldots, g_{n_g}(b_h)\}$ represented by the vertical lines joining the yellow dots in Fig. 1. The outranking relations are based on the calculation of partial concordance and discordance indices from which global concordance and credibility indices [4], [9] are derived based on an arbitrary $\lambda$-cut strategy. The final assignment (sorting procedure) of alternatives to categories operated by ET is a hard (binary) assignment based on an arbitrarily selected attitude (optimistic or pessimistic). The ET method can be summarized by the following steps: 

• ET-Step 1: Computation of the partial concordance indices $c_j(a_i, b_h)$ and $c_j(b_h, a_i)$, and of the partial discordance indices $d_j(a_i, b_h)$ and $d_j(b_h, a_i)$; 

• ET-Step 2: Computation of the global (overall) concordance indices $c(a_i, b_h)$ and $c(b_h, a_i)$ to obtain the credibility indices $\rho(a_i, b_h)$ and $\rho(b_h, a_i)$; 

• ET-Step 3: Computation of the fuzzy outranking relation grounded on the credibility indices $\rho(a_i, b_h)$ and $\rho(b_h, a_i)$, and application of a $\lambda$-cut to get the crisp outranking relation; 

• ET-Step 4: Final hard (binary) assignment of $a_i$ into $C_h$, based on the crisp outranking relation and on adopting either a pessimistic (conjunctive) or an optimistic (disjunctive) attitude. 


Let's explain in a bit more detail the steps of ET and the computation of the indices necessary for the implementation of the ET method. 


A. ET-Step 1: Partial indices 


In the ET method, the partial concordance index $c_j(a_i, b_h)$ (resp. $c_j(b_h, a_i)$) expresses to which extent the evaluations of $a_i$ and $b_h$ (resp. $b_h$ and $a_i$) are concordant with the assertion "$a_i$ is at least as good as $b_h$" (resp. "$b_h$ is at least as good as $a_i$"). $c_j(a_i, b_h) \in [0, 1]$, based on a given criterion $g_j(.)$, is computed from the difference between the criterion evaluated for the profile $b_h$ and the same criterion evaluated for the alternative $a_i$. If the difference¹ $g_j(b_h) - g_j(a_i)$ is less than (or equal to) a given indifference threshold $q_j(b_h)$, then $a_i$ and $b_h$ are considered indifferent based on the criterion $g_j(.)$. If the difference $g_j(b_h) - g_j(a_i)$ is strictly greater than a given preference threshold $p_j(b_h)$, then $a_i$ and $b_h$ are considered different based on $g_j(.)$. When $g_j(b_h) - g_j(a_i) \in [q_j(b_h), p_j(b_h)]$, the partial concordance index $c_j(a_i, b_h)$ is computed from a linear interpolation corresponding to a weak difference. Mathematically, the partial concordance indices $c_j(a_i, b_h)$ and $c_j(b_h, a_i)$ are obtained by: 

$$c_j(a_i, b_h) \triangleq \begin{cases} 0 & \text{if } g_j(b_h) - g_j(a_i) > p_j(b_h) \\ 1 & \text{if } g_j(b_h) - g_j(a_i) < q_j(b_h) \\ \dfrac{g_j(a_i) + p_j(b_h) - g_j(b_h)}{p_j(b_h) - q_j(b_h)} & \text{otherwise} \end{cases} \qquad (1)$$ 

and 

$$c_j(b_h, a_i) \triangleq \begin{cases} 0 & \text{if } g_j(a_i) - g_j(b_h) > p_j(b_h) \\ 1 & \text{if } g_j(a_i) - g_j(b_h) < q_j(b_h) \\ \dfrac{g_j(b_h) + p_j(b_h) - g_j(a_i)}{p_j(b_h) - q_j(b_h)} & \text{otherwise.} \end{cases} \qquad (2)$$ 

The partial discordance index $d_j(a_i, b_h)$ (resp. $d_j(b_h, a_i)$) expresses to which extent the evaluations of $a_i$ and $b_h$ (resp. $b_h$ and $a_i$) are opposed to the assertion "$a_i$ is at least as good as $b_h$" (resp. "$b_h$ is at least as good as $a_i$"). These indices depend on a possible veto condition expressed by the choice of a veto threshold $v_j(b_h)$ (such that $v_j(b_h) > p_j(b_h) > q_j(b_h) > 0$) imposed on some criterion $g_j(.)$. They are defined by [4], [9]: 

$$d_j(a_i, b_h) \triangleq \begin{cases} 0 & \text{if } g_j(b_h) - g_j(a_i) < p_j(b_h) \\ 1 & \text{if } g_j(b_h) - g_j(a_i) \ge v_j(b_h) \\ \dfrac{g_j(b_h) - g_j(a_i) - p_j(b_h)}{v_j(b_h) - p_j(b_h)} & \text{otherwise} \end{cases} \qquad (3)$$ 

and 

$$d_j(b_h, a_i) \triangleq \begin{cases} 0 & \text{if } g_j(a_i) - g_j(b_h) < p_j(b_h) \\ 1 & \text{if } g_j(a_i) - g_j(b_h) \ge v_j(b_h) \\ \dfrac{g_j(a_i) - g_j(b_h) - p_j(b_h)}{v_j(b_h) - p_j(b_h)} & \text{otherwise.} \end{cases} \qquad (4)$$ 

¹ For convenience, we assume here an increasing preference order. A decreasing preference order [9] can be managed similarly by multiplying criterion values by -1. 
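
The partial indices of ET-Step 1 can be sketched in Python as follows. This is an illustrative sketch, assuming strictly ordered thresholds ($q < p < v$) so that the denominators are nonzero; the function and argument names are not from the paper. The numerical values in the usage lines are hypothetical: a profile evaluation of 50 with $q = 20$, $p = 25$ and $v = 40$.

```python
def partial_concordance(g_a, g_b, q, p):
    """Partial concordance c_j(a_i, b_h) for one criterion.

    g_a, g_b: evaluations of the alternative and of the profile;
    q, p: indifference and preference thresholds (assumed q < p).
    """
    diff = g_b - g_a
    if diff > p:
        return 0.0
    if diff < q:
        return 1.0
    return (g_a + p - g_b) / (p - q)   # linear interpolation on [q, p]

def partial_discordance(g_a, g_b, p, v):
    """Partial discordance d_j(a_i, b_h) for one criterion.

    p, v: preference and veto thresholds (assumed p < v).
    """
    diff = g_b - g_a
    if diff < p:
        return 0.0
    if diff >= v:
        return 1.0
    return (g_b - g_a - p) / (v - p)   # linear interpolation on [p, v)

# Hypothetical evaluations against a profile evaluated at 50.
c_weak = partial_concordance(27.5, 50.0, q=20.0, p=25.0)   # halfway between q and p
d_weak = partial_discordance(17.5, 50.0, p=25.0, v=40.0)   # halfway between p and v
```

The reversed indices $c_j(b_h, a_i)$ and $d_j(b_h, a_i)$ are obtained from the same functions by swapping the two evaluations, for instance `partial_concordance(g_b, g_a, q, p)`.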
B. ET-Step 2: Global concordance and credibility indices 


• The global concordance indices: The global concordance index $c(a_i, b_h)$ (resp. $c(b_h, a_i)$) expresses to which extent the evaluations of $a_i$ and $b_h$ (resp. $b_h$ and $a_i$) on all criteria are concordant with the assertion "$a_i$ outranks $b_h$" (resp. "$b_h$ outranks $a_i$"). In the ET method, $c(a_i, b_h)$ (resp. $c(b_h, a_i)$) is computed as the weighted average of the partial concordance indices $c_j(a_i, b_h)$ (resp. $c_j(b_h, a_i)$). That is 

$$c(a_i, b_h) = \sum_{j=1}^{n_g} w_j c_j(a_i, b_h) \qquad (5)$$ 

and 

$$c(b_h, a_i) = \sum_{j=1}^{n_g} w_j c_j(b_h, a_i) \qquad (6)$$ 

where the weights $w_j \in [0, 1]$ represent the relative importance of each criterion $g_j(.)$ in the evaluation of the global concordance indices. The weights add to one. Since all $c_j(a_i, b_h)$ and $c_j(b_h, a_i)$ belong to $[0; 1]$, $c(a_i, b_h)$ and $c(b_h, a_i)$ given by (5) and (6) also belong to $[0; 1]$. 

• The global credibility indices: The degree of credibility of the outranking relation denoted as $\rho(a_i, b_h)$ (resp. $\rho(b_h, a_i)$) expresses to which extent "$a_i$ outranks $b_h$" (resp. "$b_h$ outranks $a_i$") according to the global concordance index $c(a_i, b_h)$ and the discordance indices $d_j(a_i, b_h)$ for all criteria (resp. $c(b_h, a_i)$ and $d_j(b_h, a_i)$). In the ET method, these credibility indices $\rho(a_i, b_h)$ (resp. $\rho(b_h, a_i)$) are computed by discounting (weakening) the global concordance indices $c(a_i, b_h)$ given by (5) (resp. $c(b_h, a_i)$ given by (6)) by a discounting factor $\alpha(a_i, b_h)$ in $[0; 1]$ (resp. $\alpha(b_h, a_i)$) as follows: 

$$\begin{cases} \rho(a_i, b_h) = c(a_i, b_h)\,\alpha(a_i, b_h) \\ \rho(b_h, a_i) = c(b_h, a_i)\,\alpha(b_h, a_i) \end{cases} \qquad (7)$$ 

The discounting factors $\alpha(a_i, b_h)$ and $\alpha(b_h, a_i)$ are defined by [9], [10]: 

$$\alpha(a_i, b_h) \triangleq \begin{cases} 1 & \text{if } V_1 = \emptyset \\ \prod_{j \in V_1} \dfrac{1 - d_j(a_i, b_h)}{1 - c(a_i, b_h)} & \text{if } V_1 \ne \emptyset \end{cases} \qquad (8)$$ 

$$\alpha(b_h, a_i) \triangleq \begin{cases} 1 & \text{if } V_2 = \emptyset \\ \prod_{j \in V_2} \dfrac{1 - d_j(b_h, a_i)}{1 - c(b_h, a_i)} & \text{if } V_2 \ne \emptyset \end{cases} \qquad (9)$$ 

where $V_1$ (resp. $V_2$) is the set of indexes $j$ where the partial discordance index $d_j(a_i, b_h)$ (resp. $d_j(b_h, a_i)$) is greater than the global concordance index $c(a_i, b_h)$ (resp. $c(b_h, a_i)$), that is: 

$$V_1 \triangleq \{ j \in J \mid d_j(a_i, b_h) > c(a_i, b_h) \} \qquad (10)$$ 

$$V_2 \triangleq \{ j \in J \mid d_j(b_h, a_i) > c(b_h, a_i) \} \qquad (11)$$ 
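
The global concordance and credibility indices of ET-Step 2 can be sketched as below. This is a minimal Python sketch with illustrative names; the discounting factor uses the standard ELECTRE TRI form $(1 - d_j)/(1 - c)$ over the criteria whose discordance exceeds the global concordance, and the weights and partial indices in the usage lines are hypothetical.

```python
from math import prod

def global_concordance(weights, partial_c):
    """Weighted average of the partial concordance indices (weights sum to 1)."""
    return sum(w * c for w, c in zip(weights, partial_c))

def credibility(weights, partial_c, partial_d):
    """Credibility index: global concordance weakened by the vetoing criteria."""
    c = global_concordance(weights, partial_c)
    # Criteria whose discordance is greater than the global concordance.
    vetoing = [d for d in partial_d if d > c]
    # d > c with d <= 1 implies c < 1, so the denominator is never zero.
    alpha = prod((1.0 - d) / (1.0 - c) for d in vetoing)  # empty product is 1
    return c * alpha

# Hypothetical three-criteria case: the third criterion is strongly discordant.
weights = [0.5, 0.3, 0.2]
rho = credibility(weights, partial_c=[1.0, 1.0, 0.0], partial_d=[0.0, 0.0, 0.9])
```

In this hypothetical case the global concordance is 0.8, the third criterion is the only one in the veto set, the discounting factor is 0.1/0.2 = 0.5, and the credibility index is 0.4.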


C. ET-Step 3: Fuzzy and crisp outranking process 


Outranking relations result from the transformation of the fuzzy outranking relation (corresponding to the credibility indices) into a crisp outranking relation² $S$ by means of a $\lambda$-cut [9]. $\lambda$ is called the cutting level. $\lambda$ is the smallest value of the credibility index $\rho(a_i, b_h)$ compatible with the assertion "$a_i$ outranks $b_h$". Similarly, $\lambda$ is the smallest value of the credibility index $\rho(b_h, a_i)$ compatible with the assertion "$b_h$ outranks $a_i$". In practice the choice of the value of $\lambda$ is not easy and is made arbitrarily or based on a sensitivity analysis. More precisely, the crisp outranking relation $S$ is defined by 

$$\begin{cases} \rho(a_i, b_h) \ge \lambda \Rightarrow a_i\, S\, b_h \\ \rho(b_h, a_i) \ge \lambda \Rightarrow b_h\, S\, a_i \end{cases} \qquad (12)$$ 

Binary relations of preference ($>$), indifference ($I$) and incomparability ($R$) are defined according to (13): 

$$\begin{cases} a_i\, I\, b_h \iff a_i\, S\, b_h \text{ and } b_h\, S\, a_i \\ a_i > b_h \iff a_i\, S\, b_h \text{ and not } b_h\, S\, a_i \\ a_i < b_h \iff \text{not } a_i\, S\, b_h \text{ and } b_h\, S\, a_i \\ a_i\, R\, b_h \iff \text{not } a_i\, S\, b_h \text{ and not } b_h\, S\, a_i \end{cases} \qquad (13)$$ 
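
The $\lambda$-cut and the four binary relations can be sketched in Python as below. The function name and the string labels are illustrative choices, and the credibility values in the usage lines are hypothetical.

```python
def outranking_relation(rho_ab, rho_ba, lam):
    """Classify the pair (a_i, b_h) from the two credibility indices.

    Returns "I" (indifference), ">" (a preferred to b),
    "<" (b preferred to a) or "R" (incomparability).
    """
    a_outranks_b = rho_ab >= lam   # lambda-cut on rho(a_i, b_h)
    b_outranks_a = rho_ba >= lam   # lambda-cut on rho(b_h, a_i)
    if a_outranks_b and b_outranks_a:
        return "I"
    if a_outranks_b:
        return ">"
    if b_outranks_a:
        return "<"
    return "R"

# Hypothetical credibility values, cut at two different levels.
relation_low_cut = outranking_relation(0.80, 0.70, lam=0.65)
relation_high_cut = outranking_relation(0.80, 0.70, lam=0.75)
```

With these hypothetical values the pair is judged indifferent at the lower cutting level and the alternative is preferred to the profile at the higher one, which illustrates how sensitive the crisp relation is to the choice of $\lambda$.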


D. ET-Step 4: Hard assignment procedure 


Based on the outranking relations between all pairs of alternatives and profiles of categories, two attitudes can be used in ET to assign each alternative $a_i$ into a category $C_h$ [6]. These attitudes yield a hard assignment solution where each alternative belongs or doesn't belong to a category (binary assignment), and there is no measure of the confidence of the assignment in this last step of the ET method. The pessimistic and optimistic hard assignments are realized as follows: 

• Pessimistic hard assignment: $a_i$ is compared with $b_k$, $b_{k-1}$, $b_{k-2}$, ..., until $a_i$ outranks $b_h$ where $h \le k$. The alternative $a_i$ is then assigned to the highest category $C_h$, that is $a_i \to C_h$, if $\rho(a_i, b_h) \ge \lambda$. 

• Optimistic hard assignment: $a_i$ is compared successively to $b_1$, $b_2$, ..., $b_h$, ... until $b_h$ outranks $a_i$. The alternative $a_i$ is assigned to the lowest category $C_h$, $a_i \to C_h$, for which the upper profile $b_h$ is preferred to $a_i$. 


III. THE NEW SOFT ELECTRE TRI (SET) METHOD 


The objective and motivation of this paper are to improve the appealing ET method in order to provide a soft assignment procedure of alternatives into categories, and to eliminate the drawbacks concerning both the choice of the $\lambda$-cut level in ET-Step 3 and the choice of attitude in ET-Step 4. Soft assignment reflects the confidence one has in the assignment, which can be a very useful property in applications requiring multi-criteria decision analysis. To achieve this purpose, and owing to long experience in working with belief functions (BF), it has appeared clearly that BF can be very useful for developing a "soft-assignment" version of the classical ET presented in the previous section. We call this new method the "Soft ELECTRE TRI" method (SET for short) and we present it in detail in this section. 

² It is denoted $S$ because outranking translates to "Surclassement" in French. 

Before going further, it is necessary to recall briefly the definition of a mass of belief $m(.)$ (also called basic belief assignment, or bba), a credibility function $Bel(.)$ and the plausibility function $Pl(.)$ defined over a finite set $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ of mutually exhaustive and exclusive hypotheses. Belief functions have been introduced by Shafer in his development of Dempster-Shafer Theory (DST), see [11] for details. In DST, $\Theta$ is called the frame of discernment of the problem under consideration. By convention the power-set (i.e. the set of all subsets of $\Theta$) is denoted $2^\Theta$ since its cardinality is $2^{|\Theta|}$. A basic belief assignment provided by a source of evidence is a mapping $m(.) : 2^\Theta \to [0, 1]$ satisfying 

$$m(\emptyset) = 0 \quad \text{and} \quad \sum_{X \in 2^\Theta} m(X) = 1 \qquad (14)$$ 

The measures of credibility and plausibility of any proposition $X \in 2^\Theta$ are defined from $m(.)$ by 

$$Bel(X) \triangleq \sum_{\substack{Y \subseteq X \\ Y \in 2^\Theta}} m(Y) \qquad (15)$$ 

$$Pl(X) \triangleq \sum_{\substack{Y \cap X \ne \emptyset \\ Y \in 2^\Theta}} m(Y) \qquad (16)$$ 

$Bel(X)$ and $Pl(X)$ are usually interpreted as lower and upper bounds of the unknown probability of $X$. $U(X) = Pl(X) - Bel(X)$ reflects the uncertainty on $X$. Belief functions are well adapted to model the uncertainty expressed by a given source of evidence. For information fusion purposes, many solutions have been proposed in the literature [12] to combine bba's efficiently for pooling evidence arising from several sources. 
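
The definitions of $Bel(.)$ and $Pl(.)$ translate directly into code. The following Python sketch uses a hypothetical three-element frame and a hypothetical bba; focal elements are represented as `frozenset` objects, which is an implementation choice made here for illustration.

```python
def bel(m, x):
    """Credibility: total mass of the focal elements included in x."""
    return sum(mass for y, mass in m.items() if y <= x)

def pl(m, x):
    """Plausibility: total mass of the focal elements intersecting x."""
    return sum(mass for y, mass in m.items() if y & x)

# Hypothetical bba on the frame {t1, t2, t3}; the masses sum to one.
m = {
    frozenset({"t1"}): 0.5,
    frozenset({"t1", "t2"}): 0.3,
    frozenset({"t1", "t2", "t3"}): 0.2,
}

x = frozenset({"t1", "t2"})
uncertainty = pl(m, x) - bel(m, x)   # U(X) = Pl(X) - Bel(X)
```

For this hypothetical bba, $Bel(\{t1, t2\}) = 0.8$ and $Pl(\{t1, t2\}) = 1$, so the unknown probability of the proposition is bracketed by $[0.8, 1]$ and $U = 0.2$.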


As for the classical ET method, there are four main steps in our new SET method. However, the SET steps are different from the ET steps. The four steps of SET, which are very specific and improve on the ET steps, are: 

• SET-Step 1: Computation of the partial concordance indices $c_j(a_i, b_h)$ and $c_j(b_h, a_i)$, the partial discordance indices $d_j(a_i, b_h)$ and $d_j(b_h, a_i)$, and also the partial uncertainty indices $u_j(a_i, b_h)$ and $u_j(b_h, a_i)$ thanks to a smooth sigmoidal model for generating bba's [13]. 

• SET-Step 2: Computation of the global (overall) concordance indices $c(a_i, b_h)$, $c(b_h, a_i)$, discordance indices $d(a_i, b_h)$, $d(b_h, a_i)$, and uncertainty indices $u(a_i, b_h)$, $u(b_h, a_i)$; 

• SET-Step 3: Computation of the probabilized outranking relations grounded on the global indices of SET-Step 2. The probabilization is directly obtained and thus eliminates the arbitrary $\lambda$-cut strategy necessary in ET. 

• SET-Step 4: Final soft assignment of $a_i$ into $C_h$ based on combinatorics of probabilized outranking relations. 


Let's explain in detail the four steps of SET and the computation of the indices necessary for the implementation of the SET method. 


A. SET-Step 1: Partial indices 


In SET, a sigmoid model is proposed to replace the original truncated trapezoidal model for computing the concordance and discordance indices of the ET method. The sigmoidal model has been presented in detail in [13] and is only briefly recalled here. We consider a binary frame of discernment³ $\Theta \triangleq \{c, \bar{c}\}$ where $c$ means that the alternative $a_i$ is concordant with the assertion "$a_i$ is at least as good as profile $b_h$", and $\bar{c}$ means that the alternative $a_i$ is opposed (discordant) to this assertion. We can compute a basic belief assignment (bba) $m_{ih}(.)$ defined on $2^\Theta$ for each pair $(a_i, b_h)$. $m_{ih}(.)$ is defined from the combination (fusion) of the local bba's $m^j_{ih}(.)$ evaluated from each possible criterion $g_j(.)$ as follows: $m^j_{ih}(.) = [m_1 \oplus m_2](.)$ is obtained by the fusion⁴ (denoted symbolically by $\oplus$) of the two following simple bba's: 

focal element | $m_1(.)$ | $m_2(.)$ 
$c$ | $f_{s_c, t_c}(g)$ | 0 
$\bar{c}$ | 0 | $f_{-s_{\bar{c}}, t_{\bar{c}}}(g)$ 
$c \cup \bar{c}$ | $1 - f_{s_c, t_c}(g)$ | $1 - f_{-s_{\bar{c}}, t_{\bar{c}}}(g)$ 

Table I: Construction of $m_1(.)$ and $m_2(.)$. 

where $f_{s,t}(g) \triangleq 1/(1 + e^{-s(g - t)})$ is the sigmoid function; $g$ is the criterion magnitude of the alternative under consideration; $t$ is the abscissa of the inflection point of the sigmoid. The abscissas of the inflection points are given by $t_c = g_j(b_h) - \frac{1}{2}(p_j(b_h) + q_j(b_h))$ and $t_{\bar{c}} = g_j(b_h) - \frac{1}{2}(p_j(b_h) + v_j(b_h))$, and the parameters $s_c$ and $s_{\bar{c}}$ are given by⁵ $s_c = 4/(p_j(b_h) - q_j(b_h))$ and $s_{\bar{c}} = 4/(v_j(b_h) - p_j(b_h))$. 

From the setting of the threshold parameters $p_j(b_h)$, $q_j(b_h)$ and $v_j(b_h)$ (the same as for the ET method), it is easy to compute the parameters of the sigmoids $(t_c, s_c)$ and $(t_{\bar{c}}, s_{\bar{c}})$, and thus to get the values of the bba's $m_1(.)$ and $m_2(.)$ to compute $m^j_{ih}(.)$. We recommend using the PCR5 fusion rule⁶ since it offers a better management of conflicting bba's, yielding more specific results than with other rules. Based on this sigmoidal modeling, we now get from $m^j_{ih}(.)$ a fully consistent and efficient representation of the local concordance $c_j(a_i, b_h)$, the local discordance $d_j(a_i, b_h)$ and the local uncertainty $u_j(a_i, b_h)$ by considering: 

$$\begin{cases} c_j(a_i, b_h) \triangleq m^j_{ih}(c) \in [0, 1] \\ d_j(a_i, b_h) \triangleq m^j_{ih}(\bar{c}) \in [0, 1] \\ u_j(a_i, b_h) \triangleq m^j_{ih}(c \cup \bar{c}) \in [0, 1] \end{cases} \qquad (17)$$ 

Of course, a similar approach must be adopted (not reported here due to space limitations) to compute $c_j(b_h, a_i) = m^j_{hi}(c)$, $d_j(b_h, a_i) = m^j_{hi}(\bar{c})$ and $u_j(b_h, a_i) = m^j_{hi}(c \cup \bar{c})$. 

³ Here we assume that Shafer's model holds, that is $c \cap \bar{c} = \emptyset$. 

⁴ With the averaging rule, the PCR5 rule, or the Dempster-Shafer rule [14]. 

⁵ The coefficient 4 appearing in the $s_c$ and $s_{\bar{c}}$ expressions comes from the fact that for a sigmoid of parameter $s$, the slope of the tangent at its inflection point is $s/4$. 

⁶ See [15] for details on PCR5 with many examples. 


Example 1: Let’s consider only one alternative $a_i$ and $g_j(.)$
in the range [0, 100], and let’s take $g_j(b_h) = 50$ and the following
thresholds: $q_j(b_h) = 20$ (indifference threshold), $p_j(b_h) = 25$
(preference threshold) and $v_j(b_h) = 40$ (veto threshold) for the
profile bound $b_h$. Then, the inflection points of the sigmoids
$f_1(g) \triangleq f_{s_c,t_c}(g)$ and $f_2(g) \triangleq f_{-s_{\bar c},t_{\bar c}}(g)$ have the following
abscissas: $t_c = 50 - (25 + 20)/2 = 27.5$ and $t_{\bar c} = 50 - (25 + 40)/2 = 17.5$,
and parameters: $s_c = 4/(25 - 20) = 4/5 = 0.8$
and $s_{\bar c} = 4/(40 - 25) = 4/15 \approx 0.2667$. The
consistent bba $m_{ih}^j(.)$ is obtained by the PCR5 fusion
of the bba’s $m_1(.)$ and $m_2(.)$ given in Table I. The result is
shown in Fig. 2.


[Figure 2 plot: sigmoid model with increasing preferences.]

Figure 2: $m_{ih}^j(.)$ corresponding to partial indices.


The blue curve corresponds to $c_j(a_i, b_h)$, the red plot
corresponds to $d_j(a_i, b_h)$ and the green plot to $u_j(a_i, b_h)$ when
$g_j(a_i)$ varies in [0; 100]. $c_j(b_h, a_i)$, $d_j(b_h, a_i)$ and $u_j(b_h, a_i)$
can easily be obtained by mirroring (horizontal flip) the curves
around the vertical axis at the mid-range value $g_j(a_i) = 50$.
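As an illustration, the construction of Example 1 can be sketched in Python. The names `sigmoid` and `local_bba` are hypothetical, and the redistribution step uses the usual two-source form of PCR5 for the single conflicting pair $(c, \bar{c})$; this is a sketch under those assumptions, not the authors' implementation.

```python
import math

def sigmoid(g, s, t):
    """Sigmoid f_{s,t}(g) = 1 / (1 + exp(-s (g - t)))."""
    return 1.0 / (1.0 + math.exp(-s * (g - t)))

def local_bba(g, g_b, q, p, v):
    """Return (c_j, d_j, u_j) for an increasing-preference criterion.

    g is g_j(a_i), g_b is g_j(b_h); q, p, v are the indifference,
    preference and veto thresholds attached to the profile b_h.
    """
    t_c, s_c = g_b - 0.5 * (p + q), 4.0 / (p - q)
    t_d, s_d = g_b - 0.5 * (p + v), 4.0 / (v - p)
    m1_c = sigmoid(g, s_c, t_c)    # m1 commits this mass to c
    m2_d = sigmoid(g, -s_d, t_d)   # m2 commits this mass to c-bar
    # Conjunctive part: each focal element meets the other source's ignorance.
    c = m1_c * (1.0 - m2_d)
    d = (1.0 - m1_c) * m2_d
    u = (1.0 - m1_c) * (1.0 - m2_d)
    # PCR5 step: the conflicting product m1(c) * m2(c-bar) is sent back to
    # c and c-bar proportionally to the two masses involved.
    conflict = m1_c * m2_d
    if conflict > 0.0:
        c += m1_c * conflict / (m1_c + m2_d)
        d += m2_d * conflict / (m1_c + m2_d)
    return c, d, u

# Example 1 settings: g_j(b_h) = 50, q = 20, p = 25, v = 40.
for g in (0, 17.5, 27.5, 50, 100):
    print(g, local_bba(g, 50, 20, 25, 40))
```

Sweeping `g` over [0, 100] gives three curves whose values sum to one at every point, with discordance dominating for low `g` and concordance for high `g`.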


B. SET-Step 2: Global indices 


As explained in SET-Step 1, the partial indices are encapsulated
in the bba’s $m_{ih}^j(.)$ for alternative $a_i$ versus profile
$b_h$ ($a_i$ vs. $b_h$), and in the bba’s $m_{hi}^j(.)$ for profile
$b_h$ versus alternative $a_i$ ($b_h$ vs. $a_i$). In SET, the global indices
$c(a_i, b_h)$, $d(a_i, b_h)$ and $u(a_i, b_h)$ are obtained by the fusion
of the $n_g$ bba’s $m_{ih}^j(.)$. Similarly, the global indices $c(b_h, a_i)$,
$d(b_h, a_i)$ and $u(b_h, a_i)$ are obtained by the fusion of the $n_g$
bba’s $m_{hi}^j(.)$. More precisely, one must compute:











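$m_{ih}(.) = [m_{ih}^1 \oplus m_{ih}^2 \oplus \ldots \oplus m_{ih}^{n_g}](.)$
$m_{hi}(.) = [m_{hi}^1 \oplus m_{hi}^2 \oplus \ldots \oplus m_{hi}^{n_g}](.)$   (18)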


To take into account the weighting factor $w_j$ of the criterion
valued by $g_j(.)$, we suggest using as fusion operator $\oplus$ either:
• the weighted averaging fusion rule (as in the ET method),
which is simple and compatible with probability calculus
and Bayesian reasoning,
• or the more sophisticated operator defined by the PCR5
fusion rule adapted for importance discounting, presented
in detail in [16], which belongs to the family of non-Bayesian
fusion operators.
Once the bba’s $m_{ih}(.)$ and $m_{hi}(.)$ have been computed, the
global indices are defined by:


$c(a_i, b_h) \triangleq m_{ih}(c)\,\alpha(a_i, b_h)$
$d(a_i, b_h) \triangleq m_{ih}(\bar{c})\,\beta(a_i, b_h)$
$u(a_i, b_h) \triangleq 1 - c(a_i, b_h) - d(a_i, b_h)$   (19)


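The discounting factors $\alpha(a_i, b_h)$ and $\beta(a_i, b_h)$ are defined by

$\alpha(a_i, b_h) \triangleq \begin{cases} 1 & \text{if } V_\alpha = \emptyset \\ \prod_{j \in V_\alpha} \frac{1 - d_j(a_i, b_h)}{1 - m_{ih}(c)} & \text{if } V_\alpha \neq \emptyset \end{cases}$   (20)

$\beta(a_i, b_h) \triangleq \begin{cases} 1 & \text{if } V_\beta = \emptyset \\ \prod_{j \in V_\beta} \frac{1 - c_j(a_i, b_h)}{1 - m_{ih}(\bar{c})} & \text{if } V_\beta \neq \emptyset \end{cases}$   (21)

with

$V_\alpha \triangleq \{j \in J \mid d_j(a_i, b_h) > m_{ih}(c)\}$
$V_\beta \triangleq \{j \in J \mid c_j(a_i, b_h) > m_{ih}(\bar{c})\}$   (22)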


$c(b_h, a_i)$, $d(b_h, a_i)$ and $u(b_h, a_i)$ are similarly computed
using the dual formulas of (19)–(22).
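As a rough illustration of the first fusion option (weighted averaging), the local bba's of one pair $(a_i, b_h)$ can be combined as below. The function name and the normalisation of the weights $w_j$ to sum to one are assumptions made for this sketch, and the discounting factors $\alpha$ and $\beta$ are not applied here.

```python
def weighted_average_bba(bbas, weights):
    """Fuse local bba's over {c, c-bar, c U c-bar} by weighted averaging.

    bbas    : list of (c_j, d_j, u_j) triples, one per criterion g_j
    weights : list of criterion weights w_j (normalised here, by assumption)
    """
    total = float(sum(weights))
    fused = [0.0, 0.0, 0.0]
    for (c, d, u), w in zip(bbas, weights):
        fused[0] += w / total * c
        fused[1] += w / total * d
        fused[2] += w / total * u
    return tuple(fused)

# Two hypothetical criteria: one fully concordant, one fully discordant.
m_ih = weighted_average_bba([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [3, 1])
```

Because each input triple sums to one and the normalised weights sum to one, the fused triple is again a valid bba.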


The belief and plausibility of the outranking propositions
$X$ = “$a_i > b_h$” and $Y$ = “$b_h > a_i$” are then given by

$Bel(X) = c(a_i, b_h)$ and $Bel(Y) = c(b_h, a_i)$   (23)

$Pl(X) = 1 - d(a_i, b_h) = c(a_i, b_h) + u(a_i, b_h)$ and
$Pl(Y) = 1 - d(b_h, a_i) = c(b_h, a_i) + u(b_h, a_i)$   (24)





C. SET-Step 3: Probabilized outranking 


We have seen in SET-Step 2 that the outrankings $X$ =
“$a_i > b_h$” and $Y$ = “$b_h > a_i$” can be characterized by their
imprecise probabilities $P(X) \in [Bel(X); Pl(X)]$ and $P(Y) \in [Bel(Y); Pl(Y)]$.
Figure 3 shows an example with $P(X) \in [0.2; 0.8]$ and $P(Y) \in [0.1; 0.5]$.





[Figure 3 diagram: on a [0, 1] axis, $P(X)$ lies between $Bel(X) = 0.2$ and $Pl(X) = 0.8$, and $P(Y)$ lies between $Bel(Y) = 0.1$ and $Pl(Y) = 0.5$.]


Figure 3: Imprecise probabilities of outrankings.


Solving the outranking problem consists in deciding whether
$X$ finally dominates $Y$ (in which case we must
decide $X$ as being the valid outranking), or whether $Y$ dominates
$X$ (in which case we decide $Y$ as being the valid outranking).
Unfortunately, such a hard (binary) assignment cannot
be done in general⁷ because it must be drawn from the
unknown probabilities $P(X)$ in $[Bel(X); Pl(X)]$ and $P(Y)$
in $[Bel(Y); Pl(Y)]$, where a partial overlap is possible
between the intervals $[Bel(X); Pl(X)]$ and $[Bel(Y); Pl(Y)]$ (see
Fig. 3). A soft (probabilized) outranking solution is possible
by computing the probability that $X$ dominates $Y$ (or that $Y$
dominates $X$), assuming a uniform distribution of the unknown
probabilities between their lower and upper bounds. To get the
probabilized outrankings, we just need to compute $P_{X>Y} \triangleq P(P(X) > P(Y))$
and $P_{Y>X} \triangleq P(P(Y) > P(X))$, which
are precisely computable as the ratio of two polygonal areas,
or can be estimated using sampling techniques.


⁷ Except in cases where the bounds of the probabilities $P(X)$ and $P(Y)$ do not
overlap.


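[Figure 4 plot: the rectangle $U(X) \times U(Y)$ in the $(P(X), P(Y))$ plane, split by the line $P(X) = P(Y)$ into the region $P(X) > P(Y)$ of area $A(X) = 0.195$ and the region $P(Y) > P(X)$ of area $A(Y) = 0.045$.]

Figure 4: Probabilization of outranking.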


More precisely,

$P_{X>Y} = A(X)/(A(X) + A(Y))$
$P_{Y>X} = A(Y)/(A(X) + A(Y))$   (25)

where $A(X)$ is the partial area of the rectangle $A = U(X) \times U(Y)$
under the line $P(X) = P(Y)$ (yellow area in Fig. 4) and
$A(Y)$ is the area of the rectangle $A = U(X) \times U(Y)$ above
the line $P(X) = P(Y)$ (orange area in Fig. 4). Of course,
$A = A(X) + A(Y)$ and $P_{X>Y} = 1 - P_{Y>X}$. As a final result
for the example of Fig. 3, and according to (25) and Fig. 4,
we get the following probabilized outrankings:


$a_i > b_h$ with probability $P_{X>Y} = 0.195/0.24 = 0.8125$
$b_h > a_i$ with probability $P_{Y>X} = 0.045/0.24 = 0.1875$
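The area ratio can also be checked numerically. The sketch below assumes that $P(X)$ and $P(Y)$ are independent and uniform on their intervals (which is what the area ratio over the rectangle amounts to) and that both intervals are non-degenerate; the function name `prob_x_gt_y` is hypothetical.

```python
def prob_x_gt_y(x_lo, x_hi, y_lo, y_hi):
    """P(P(X) > P(Y)) for independent uniform P(X), P(Y) on their intervals.

    Assumes non-degenerate intervals (x_lo < x_hi and y_lo < y_hi).
    """
    def p_x_above(y):
        # P(P(X) > y) for P(X) uniform on [x_lo, x_hi], clipped to [0, 1].
        return min(1.0, max(0.0, (x_hi - y) / (x_hi - x_lo)))

    # p_x_above is piecewise linear in y, so the trapezoid rule is exact
    # once [y_lo, y_hi] is split at the breakpoints x_lo and x_hi.
    cuts = {y_lo, y_hi} | {b for b in (x_lo, x_hi) if y_lo < b < y_hi}
    pts = sorted(cuts)
    area = sum(0.5 * (p_x_above(a) + p_x_above(b)) * (b - a)
               for a, b in zip(pts, pts[1:]))
    return area / (y_hi - y_lo)

# Intervals of Fig. 3: P(X) in [0.2, 0.8], P(Y) in [0.1, 0.5].
p_xy = prob_x_gt_y(0.2, 0.8, 0.1, 0.5)
p_yx = 1.0 - p_xy
print(p_xy, p_yx)
```

For the intervals of Fig. 3 this reproduces the two ratios 0.195/0.24 and 0.045/0.24 up to floating-point rounding; a sampling estimate would converge to the same values.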


For notational convenience, we denote the probabilities of
outrankings as $P_{ih} \triangleq P_{X>Y}$ with $X$ = “$a_i > b_h$” and $Y$ =
“$b_h > a_i$”. Reciprocally, we denote $P_{hi} \triangleq P_{Y>X} = 1 - P_{ih}$.


D. SET-Step 4: Soft assignment procedure 


From the probabilized outrankings obtained in SET-Step
3, we are now able to directly make the soft assignment
of the alternatives $a_i$ to the categories $C_h$ defined by their profiles
$b_h$. This is easily obtained by the combinatorics of all
possible sequences of outrankings, taking into account their
probabilities. Moreover, this soft assignment mechanism
also provides the probability $\delta_i \triangleq P(a_i \to \emptyset)$ reflecting
the impossibility of making a coherent outranking. Our soft
assignment procedure does not require an arbitrary choice of
attitude, contrary to what is proposed in the classical
ET method. For simplicity, we present the soft assignment
procedure in Example 2 below, which can be adapted to
any number $n_h > 2$ of categories.


Example 2: Let’s consider one alternative $a_i$ to be assigned
to categories $C_1$, $C_2$ and $C_3$ based on multiple criteria (taking
into account indifference, preference and veto conditions) and
intermediate profiles $b_1$ and $b_2$. Because $b_0$ and $b_3$ are the min
and max profiles, one always has $P(X_{i0}$ = “$a_i > b_0$”$) = 1$
and $P(X_{i3}$ = “$a_i > b_3$”$) = 0$. Let’s assume that at
SET-Step 3 one gets the soft outranking probabilities $P_{ih}$
given in Table II.


Profiles $b_h$ → | $b_0$ | $b_1$ | $b_2$ | $b_3$
Outranking probas $P_{ih}$ | 1 | 0.7 | 0.2 | 0

Table II: Soft outranking probabilities.


From combinatorics, only the following outranking sequences
$S_k(a_i)$, $k = 1, 2, 3, 4$ can occur with non-null
probabilities $P(S_k(a_i))$, as listed in Table III.


Profiles $b_h$ → | $b_0$ | $b_1$ | $b_2$ | $b_3$ | $P(S_k(a_i))$
$S_1(a_i)$ | > | > | > | < | 0.14
$S_2(a_i)$ | > | > | < | < | 0.56
$S_3(a_i)$ | > | < | < | < | 0.24
$S_4(a_i)$ | > | < | > | < | 0.06

Table III: Probabilities of outranking sequences.


The probabilities $P(S_k(a_i))$
have been computed as the product of the probabilities of each
outranking involved in the sequence, that is:


$P(S_1(a_i)) = 1 \times 0.7 \times 0.2 \times 1 = 0.14$
$P(S_2(a_i)) = 1 \times 0.7 \times (1 - 0.2) \times 1 = 0.56$
$P(S_3(a_i)) = 1 \times (1 - 0.7) \times (1 - 0.2) \times 1 = 0.24$
$P(S_4(a_i)) = 1 \times (1 - 0.7) \times 0.2 \times 1 = 0.06$


The assignment of $a_i$ into a category $C_h$ delimited by the bounds
$b_{h-1}$ and $b_h$ depends on the occurrence of the outranking
sequences. Given $S_1(a_i)$ with probability $P(S_1(a_i)) = 0.14$,
$a_i$ must be assigned to $C_3$ because $a_i$ outranks $b_0$, $b_1$ and
$b_2$. Given $S_2(a_i)$ with probability 0.56, $a_i$ must be assigned
to $C_2$ because $a_i$ outranks only $b_0$ and $b_1$. Given $S_3(a_i)$
with probability 0.24, $a_i$ must be assigned to $C_1$ because
$a_i$ outranks only $b_0$. Given $S_4(a_i)$ with probability 0.06,
$a_i$ cannot reasonably be assigned to any category because of the
inherent inconsistency of the outranking sequence $S_4(a_i)$:
$a_i$ cannot outperform $b_2$ and simultaneously underperform
$b_1$ since, by profile ordering, one has $b_2 > b_1$. Therefore
the inconsistency indicator is given by $\delta_i = P(a_i \to \emptyset) = P(S_4(a_i)) = 0.06$.
Finally, the soft assignment probabilities
$P(a_i \to C_h)$ and the inconsistency indicator obtained by
SET-Step 4 are given in Table IV.


Categories $C_h$ → | $C_1$ | $C_2$ | $C_3$ | $\emptyset$
Assignment probas $P(a_i \to C_h)$ | 0.24 | 0.56 | 0.14 | $\delta_i = 0.06$

Table IV: SET Soft Assignment result.
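The combinatorics of Example 2 can be sketched as a brute-force enumeration over the intermediate profiles. The function name `soft_assignment` is hypothetical, and the code assumes, as in the example, that the outrankings against different profiles are treated as independent events and that $P_{i0} = 1$ and $P_{i,\max} = 0$.

```python
from itertools import product

def soft_assignment(p_intermediate):
    """Soft assignment from the P_ih of the intermediate profiles b_1..b_n.

    Returns ([P(a_i -> C_1), ..., P(a_i -> C_{n+1})], delta_i).
    """
    n = len(p_intermediate)
    probs = [0.0] * (n + 1)
    delta = 0.0
    for seq in product([True, False], repeat=n):
        # Probability of this outranking sequence (product of its terms).
        p = 1.0
        for outranks, p_ih in zip(seq, p_intermediate):
            p *= p_ih if outranks else 1.0 - p_ih
        k = sum(seq)  # number of intermediate profiles outranked
        if all(seq[:k]):
            # Consistent: a_i outranks b_1..b_k and none above -> C_{k+1}.
            probs[k] += p
        else:
            # Inconsistent: a_i outranks a profile above one it fails on.
            delta += p
    return probs, delta

probs, delta = soft_assignment([0.7, 0.2])  # P_i1 and P_i2 of Table II
print(probs, delta)
```

With the probabilities of Table II this yields the entries of Table IV, up to floating-point rounding. The enumeration grows as $2^n$, which is harmless for the small numbers of profiles used in practice.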


IV. APPLICATION EXAMPLE : ENVIRONMENTAL CONTEXT 


In this section, we compare the ET and SET methods applied
to an assignment problem related to an environmental context,
originally proposed in [8]. It corresponds to the choice of the
location of an urban waste resource recovery disposal which
aims to re-use the recyclable part of the urban waste produced by
several communities. This disposal must collect at least
20000 m³ of urban waste per year to be economically viable.
It must be a collective unit and the best possible location
has to be identified. Each community will have to bring its
urban waste production to the disposal: the transport costs are
evaluated in tons × kilometers per year (t·km/year). Building
such a disposal is generally not easily accepted by the population,
particularly when the environmental inconveniences are
already high. This initial environmental status is measured by
a specific criterion. Building an urban waste disposal implies
using a wide area that could be used for other activities such
as a sports ground, tourist facilities, a natural zone, etc.
This competition with other activities is measured by a specific
criterion.


A. Alternatives, criteria and profiles definition


In our example, 7 possible locations (alternatives/choices)
$a_i$, $i = 1, 2, \ldots, 7$, for the urban waste resource recovery disposal
are compared according to the following 5 criteria $g_j$, $j = 1, 2, \ldots, 5$:


$g_1$ = Terrain price (decreasing preference);
$g_2$ = Transport costs (decreasing preference);
$g_3$ = Environment status (increasing preference);
$g_4$ = Impacted population (increasing preference);
$g_5$ = Competition activities (increasing preference).


• Price of terrain ($g_1$) is expressed in €/m² with decreasing
preferences (the lower the price, the higher the
preference);

• Transport costs ($g_2$) are expressed in t·km/year with
decreasing preferences (the lower the cost, the higher
the preference);

• The environment status ($g_3$) corresponds to the initial
environmental inconvenience level expressed by the population,
with an increasing direction of preferences. The higher
the environment status, the lower the initial environmental
inconveniences. It is rated with an integer between
0 and 10 (the highest environment status corresponding to the
lowest initial environmental inconveniences);

• Impacted population ($g_4$) is an integrated criterion that
measures negative effects based on subjective and qualitative
criteria. It corresponds to the status of the environment,
with an increasing direction of preferences. The
higher the evaluation, the lower the negative effects.
It is rated with a real number between 0 (great number
of impacted people) and 10 (very few people impacted);

• Activities competition ($g_5$) is an integrated criterion,
evaluated by a real number, that measures the competition
level between activities with an increasing direction of
preferences. The higher the evaluation, the lower the
competition with other activities on the planned location
(tourism, sport, natural environment, etc.).


The evaluations of the 7 alternatives are summarized in
Table V, and the alternatives (possible locations) are compared
to the 2 decision profiles $b_1$ and $b_2$ described in Table VI.
The weights and the indifference, preference and veto thresholds for
the criteria $g_j$ are described in Table VII.


(a) Choices $a_i$ and criteria $g_1$ (€/m²) and $g_2$ (t·km/year).

(b) Choices $a_i$ and criteria $g_3$, $g_4$ and $g_5$.


Table V: Inputs of ET (7 alternatives according to 5 criteria).


Table VII: Thresholds.





B. Results of classical ELECTRE TRI 


After applying ET-Steps 1 and 3 of the classical ET method
described in Section II with $\lambda = 0.75$ for the $\lambda$-cut strategy,
one gets the outranking relations listed in Table VIII.

The final hard assignments obtained by the ET method using
the pessimistic and optimistic attitudes are listed in Table IX.


C. Results of the new Soft ELECTRE TRI 


After applying SET-Steps 1 and 3 of the SET method⁸
described in Section III, one gets the probabilities of soft
outrankings listed in Table X.


⁸ We have used here the PCR5 fusion rule with importance discounting [16],
and a sampling technique to compute the probabilities $P_{ih}$.


Table VIII: Outranking relations obtained with ET ($\lambda = 0.75$).


(a) Pessimistic attitude. (b) Optimistic attitude.

Table IX: Hard assignments obtained with ET ($\lambda = 0.75$).


Profiles $b_h$ → | $b_0$ | $b_1$ | $b_2$ | $b_3$
Outranking probas $P_{ih}$ ↓ | | | |
$a_1$ | 1 | 0.9858 | 0.6211 | 0
$a_2$ | 1 | 0.8908 | 0.1812 | 0
$a_3$ | 1 | 0.9999 | 0.0570 | 0
$a_4$ | 1 | 1.0000 | 0.0807 | 0
$a_5$ | 1 | 0.2142 | 0.0145 | 0
$a_6$ | 1 | 0.9996 | 0.0006 | 0
$a_7$ | 1 | 0.9975 | 0.0106 | 0

Table X: Probabilities of soft outranking relations by SET.


The final soft assignments obtained by the SET method are
listed in Table XI.


[Table XI entries: $\delta_5 = 0.0114$, $\delta_6 = 0$, $\delta_7 = 0$.]

Table XI: SET Soft Assignment matrix $[P(a_i \to C_h)]$.


D. Discussion 


From Table XI, we can get a hard assignment solution
(if needed) by assigning each alternative to the category
corresponding to the maximum of $P(a_i \to C_h)$, $h = 1, 2, \ldots$.
With SET, it is also theoretically possible to “assign” $a_i$ to
no category if $\delta_i$ (inconsistency level) is too high. The soft
assignments for $a_i$, $i = 3, \ldots, 7$ (see Tables IX and XI) are
compatible with the hard assignments obtained with the pessimistic or the
optimistic attitudes. In fact, only the soft assignments for $a_1$
and $a_2$, having the highest probabilities $P(a_1 \to C_3) = 0.6123$
and $P(a_2 \to C_2) = 0.7294$, appear incompatible with the ET
hard assignments (pessimistic or optimistic). The discrepancy
between these soft and hard assignment solutions is not due to
the SET method but comes from the arbitrary choice of the level
of the $\lambda$-cut strategy used in the ET method. Another arbitrary
choice of $\lambda$-cut will generate different ET hard assignments,
which can in fact become fully compatible with the SET soft
assignments. For example, if one takes $\lambda = 0.5$, it can be
verified that the SET soft assignments are compatible with
the ET hard assignments for all alternatives in this example. The
soft assignment approach of SET is interesting since it does not
depend on $\lambda$ values, even if the influence of the definition of the
sigmoid parameters, the choice of fusion rule, the probabilization
method, etc. could be further studied.
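The hard decision by maximum probability can be sketched in closed form, without enumerating sequences. The function names below are hypothetical. The pairs $(P_{i1}, P_{i2})$ used for $a_1$ and $a_2$ are read from Table X; pairing them with these two alternatives is an inference, supported by the fact that they reproduce the values 0.6123 and 0.7294 quoted above.

```python
def category_probabilities(p_out):
    """P(a_i -> C_k), k = 1..n+1, from P_ih of intermediate profiles b_1..b_n.

    Only consistent sequences (outrank b_1..b_k, fail on all the others)
    lead to a category; the remaining mass is the inconsistency delta_i.
    """
    n = len(p_out)
    probs = []
    for k in range(n + 1):
        p = 1.0
        for h, p_ih in enumerate(p_out):
            p *= p_ih if h < k else 1.0 - p_ih
        probs.append(p)
    return probs, 1.0 - sum(probs)

def hard_assignment(p_out):
    """Index h of the category C_h with the largest soft probability."""
    probs, _ = category_probabilities(p_out)
    return 1 + max(range(len(probs)), key=probs.__getitem__)

# (P_i1, P_i2) for a_1 and a_2, as read from Table X.
p_a1, p_a2 = (0.9858, 0.6211), (0.8908, 0.1812)
for name, p in (("a1", p_a1), ("a2", p_a2)):
    probs, delta = category_probabilities(p)
    print(name, [round(x, 4) for x in probs], round(delta, 4),
          "-> C%d" % hard_assignment(p))
```

A threshold on `delta` could be added to refuse assignment when the inconsistency level is judged too high; the text leaves that threshold unspecified, so none is assumed here.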


V. CONCLUSIONS 


A new outranking sorting method, called Soft ELECTRE
TRI (SET), inspired by the classical ELECTRE TRI and
based on belief functions and advanced fusion techniques, is
proposed. The SET method uses the same inputs as ET (same
criteria and threshold definitions) but in a more effective
way, and provides a soft (probabilized) assignment solution.
SET eliminates the inherent problem of classical ET due
to the arbitrary choice of a $\lambda$-cut strategy, which forces one to
adopt either a pessimistic or an optimistic attitude for the final
hard assignment of alternatives to categories. The interest
of SET over the ET method is demonstrated on a preexisting
environmental context scenario.


On The Validity of Dempster-Shafer Theory


Jean Dezert 
Pei Wang 
Albena Tchamova 


Abstract: We challenge the validity of Dempster-Shafer Theory
by using an emblematic example to show that the DS rule
produces counter-intuitive results. Further analysis reveals that
the result comes from an understanding of evidence pooling which
goes against the common expectation of this process. Although
DS theory has attracted some interest in the scientific community
working in information fusion and artificial intelligence, its
validity for solving practical problems is problematic, because it is
not applicable to evidence combination in general, but only to
a certain type of situations which still need to be clearly identified.


Keywords: Dempster-Shafer Theory, DST, Mathematical 
Theory of Evidence, belief functions. 


I. INTRODUCTION 


Dempster-Shafer Theory (DST), also known as the Theory 
of Evidence or the Theory of Belief Functions, was introduced 
by Shafer in 1976 [1], based on Dempster’s previous works 
[2]-[4]. This theory offers an elegant theoretical framework for 
modeling uncertainty, and provides a method for combining 
distinct bodies of evidence collected from different sources. 
Over the past three decades and more, DST has been used
in many applications, in fields including information fusion,
pattern recognition, and decision making [5].

Even so, starting from Zadeh’s criticism [6]–[8], many
questions have arisen about the validity and the consistency
of DST when combining uncertain and conflicting evidence
expressed as basic belief assignments (bba’s). Besides Zadeh’s
example, there have been several detailed analyses of this
topic by Lemmer [9], Voorbraak [10] and Wang [11]. Other
authors like Pearl [12], [13] and Walley [14], and more
recently Gelman [15], have also warned the “belief function
community” about the validity of Dempster-Shafer’s rule (DS
rule for short) for combining distinct pieces of evidence,
based on different analyses and contexts. Since the mid-1990s,
many researchers and engineers working with belief functions
in applications have observed and recognized that the DS rule
is problematic for evidence combination, especially when the
sources of evidence are highly conflicting.

In response to this challenge, various attempts have been 
made to circumvent the counter-intuitive behavior of DS 
rule. They either replace Dempster-Shafer’s rule by alternative 
rules, listed for example in [16] (Vol. 1), or apply novel 
semantic interpretations to the functions [16]-[18]. 


Originally published as Dezert J., Wang P., Tchamova A., On 
The Validity of Dempster-Shafer Theory, in Proc. of Fusion 2012, 
Singapore, July 2012, and reprinted with permission. 


Before going further in our discussion, let us recall two of 
Shafer’s statements about DST: 


The burden of our theory is that this rule [Dempster’s
rule of combination] corresponds to the pooling of
evidence: if the belief functions being combined are
based on entirely distinct bodies of evidence and the
set $\Theta$ discerns the relevant interaction between those
bodies of evidence, then the orthogonal sum gives
degree of belief that are appropriate on the basis of
combined evidence. [1] (p. 6)

This formalism [whereby propositions are represented
as subsets of a given set] is most easily
introduced in the case where we are concerned with
the true value of some quantity. If we denote the
quantity by $\theta$ and the set of its possible values by $\Theta$,
then the propositions of interest are precisely those
of the form “The true value of $\theta$ is in $T$,” where $T$
is a subset of $\Theta$. [1] (p. 36)


These two statements are very important since they are
related to two fundamental questions on DST that are central
in this discussion on the validity of DS theory:

1) What is the meaning of “pooling of evidence” used by
Shafer? Does it correspond to an experimental protocol?

2) When “the true value of $\theta$ is in $T$” is asserted by
a source of evidence, are we getting absolute truth
(based on the whole knowledge accessible by everyone
eventually) or relative truth (based on the partial
knowledge accessible by the source at the moment)?


This paper starts with a very emblematic example to show
what we consider as really problematic in the behavior of the DS rule,
which corresponds to the possible “dictatorial power” of a
source of evidence with respect to all others, so that the result
reflects the minority opinion. We demonstrate that the problem is in
fact not merely due to the level of conflict between the sources
to combine, but comes from the underlying interpretations of
evidence and degree of belief on which the combination rule
is based. Such interpretations do not agree with the common
usage of those notions, where an opinion based on certain
evidence can be revised by (informative) evidence from other
sources.

This work is based on our preliminary ideas presented in the
Spring School on Belief Functions Theory and Applications
(BFTA) in April 2011 [19], and on many fruitful discussions
with colleagues using belief functions. Their stimulating comments,
especially when they disagree, help us to clarify and
present our ideas.¹ In Section II we briefly recall the basics of DST
and the DS rule. In Section III, we describe the example and its
strange (counter-intuitive) result. In Section IV we present a
general analysis of the validity of DST, and we conclude our
analysis in Section V.


II. BASICS OF DST 


Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ be a frame of discernment of a
problem under consideration containing $n$ distinct elements $\theta_i$,
$i = 1, \ldots, n$.

A basic belief assignment (bba, also called a belief mass
function) $m(.) : 2^\Theta \to [0, 1]$ is a mapping from the power
set of $\Theta$ (i.e. the set of subsets of $\Theta$), denoted $2^\Theta$, to $[0, 1]$,
that must satisfy the following conditions: 1) $m(\emptyset) = 0$,
i.e. the mass of the empty set (impossible event) is zero; 2)
$\sum_{X \in 2^\Theta} m(X) = 1$, i.e. the mass of belief is normalized to
one. Here $m(X)$ represents the mass of belief exactly committed
to $X$. An element $X \in 2^\Theta$ is called a focal element if and
only if $m(X) > 0$. The set $\mathcal{F}(m) = \{X \in 2^\Theta \mid m(X) > 0\}$
of all focal elements of a bba $m(.)$ is called the core of the
bba. By definition, a Bayesian bba $m(.)$ is a bba having only
focal elements of cardinality 1. The vacuous bba characterizing
full ignorance is defined by $m_v(.) : 2^\Theta \to [0, 1]$ such that
$m_v(X) = 0$ if $X \neq \Theta$, and $m_v(\Theta) = 1$.

From any bba $m(.)$, the belief function $Bel(.)$ and the plausibility
function $Pl(.)$ are defined as $\forall X \in 2^\Theta$: $Bel(X) = \sum_{Y \in 2^\Theta \mid Y \subseteq X} m(Y)$
and $Pl(X) = \sum_{Y \in 2^\Theta \mid X \cap Y \neq \emptyset} m(Y)$. $Bel(X)$
represents the whole mass of belief that comes from all subsets
of $\Theta$ included in $X$. $Pl(X)$ represents the whole mass of belief
that comes from all subsets of $\Theta$ compatible with $X$ (i.e., those
intersecting $X$).
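These two definitions translate directly into code if a bba is stored as a dictionary from focal elements (as `frozenset` objects) to masses. That representation, the function names and the numerical bba below are illustrative assumptions, not taken from the text.

```python
def bel(m, x):
    """Belief of x: total mass of the focal elements included in x."""
    return sum(mass for y, mass in m.items() if y <= x)

def pl(m, x):
    """Plausibility of x: total mass of the focal elements intersecting x."""
    return sum(mass for y, mass in m.items() if y & x)

A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
theta = A | B | C

# Illustrative bba on the frame {A, B, C} (hypothetical values).
m = {A: 0.5, A | B: 0.3, theta: 0.2}

print(bel(m, A | B), pl(m, A | B))  # belief and plausibility of A U B
print(bel(m, C), pl(m, C))          # C is supported only through ignorance
```

For any subset the belief never exceeds the plausibility, and the gap between the two is the mass that is compatible with the subset without being committed to it.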

The DS rule of combination [1] is an operation denoted
$\oplus$, which corresponds to the normalized conjunction of mass
functions. Based on Shafer’s description, given two independent
and distinct sources of evidence characterized by the bba’s
$m_1(.)$ and $m_2(.)$ on the same frame of discernment $\Theta$, their
combination is defined by $m_{DS}(\emptyset) = 0$, and $\forall X \in 2^\Theta \setminus \{\emptyset\}$

$m_{DS}(X) = [m_1 \oplus m_2](X) = \frac{m_{12}(X)}{1 - K_{12}}$   (1)

where

$m_{12}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2)$   (2)

corresponds to the conjunctive consensus on $X$ between the
two sources of evidence. $K_{12}$ is the total degree of conflict
between the two sources of evidence, defined by

$K_{12} = m_{12}(\emptyset) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = \emptyset}} m_1(X_1)\, m_2(X_2)$   (3)


¹ Our presentation is not based on the previous statistical argumentation
developed in [20], since some strong proponents of DST regard it
as an invalid approach to criticizing the DS rule. In this paper we adopt a
simpler argumentation based only on common sense and simple considerations
involving witnesses’ reports.

When $K_{12} = m_{12}(\emptyset) = 1$, the two sources are said to be in total
conflict and their combination cannot be applied since the DS rule
(1) is mathematically undefined, because of the 0/0 indeterminacy
[1]. The DS rule is commutative and associative, which makes it
attractive from an engineering implementation standpoint, since
the combinations of sources can be done sequentially instead of
globally and the order doesn’t matter. Moreover, the vacuous
bba is a neutral element for the DS rule, i.e. $[m \oplus m_v](.) = [m_v \oplus m](.) = m(.)$
for any bba $m(.)$ defined on $2^\Theta$, which
seems to be an expected² property, i.e. a fully ignorant source
doesn’t impact the fusion result.
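A minimal sketch of the rule (1)–(3) is given below, assuming bba's are stored as dictionaries from `frozenset` focal elements to masses; the function name `ds_combine` and the numerical bba's are hypothetical. The total-conflict test uses a plain floating-point comparison, which is adequate for a sketch but not for production code.

```python
from itertools import product

def ds_combine(m1, m2):
    """Dempster-Shafer combination of two bba's given as {frozenset: mass}."""
    conj = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        inter = x1 & x2
        conj[inter] = conj.get(inter, 0.0) + w1 * w2   # conjunctive rule (2)
    k12 = conj.pop(frozenset(), 0.0)                   # conflicting mass (3)
    if k12 >= 1.0:
        raise ValueError("total conflict: the DS rule is undefined")
    return {x: w / (1.0 - k12) for x, w in conj.items()}   # normalization (1)

theta = frozenset("ABC")
m = {frozenset("A"): 0.6, frozenset("AB"): 0.4}   # illustrative bba
m_v = {theta: 1.0}                                # vacuous bba

# The vacuous bba is neutral, on either side of the combination.
print(ds_combine(m, m_v) == m, ds_combine(m_v, m) == m)
```

Associativity means that more than two sources can be fused by folding `ds_combine` over the list of bba's in any order.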

The conditioning of a given bba $m(.)$ by a conditional
element $Z \in 2^\Theta \setminus \{\emptyset\}$ has also been proposed by Shafer
[1]. This function $m(.|Z)$ is obtained by the DS combination of
$m(.)$ with the bba $m_Z(.)$ focused only on $Z$, i.e. such that
$m_Z(Z) = 1$. For any element $X$ of the power set $2^\Theta$ this is
mathematically expressed by

$m(X|Z) = [m \oplus m_Z](X) = [m_Z \oplus m](X)$   (4)

It has been proved [1] (p. 67) that this rule of conditioning,
expressed in terms of plausibility functions, yields the
formula

$Pl(X|Z) = Pl(X \cap Z)/Pl(Z)$   (5)

which is very similar to the well-known Bayes formula
$P(X|Z) = P(X \cap Z)/P(Z)$. Partially because of this, DST
has been widely considered as a generalization of Bayesian
inference [3], [4], or equivalently, probability theory has been
considered as a special case of the Mathematical Theory of Evidence when
manipulating Bayesian bba’s.

Despite the appealing properties of the DS rule, its apparent
similarity with the Bayes formula for conditioning, and many
attempts to justify its foundations, several challenges to the
theory’s validity have been put forth in the last decades, and
remain unanswered. For instance, an experimental protocol
to test DST was proposed by Lemmer in 1985 [9], and
his analysis shows an inherent paradox (contradiction) of
DST. Following a different approach, an inconsistency in the
fundamental postulates of DST was proved by Wang in 1994
[11]. Some other related works questioning the validity of
DST based on different argumentations have been listed in
the introduction of this paper.

In the following, we identify the origin of the problem of
the DS rule under the common interpretation of the “pooling” of
evidence, and explain why it is very risky to use it in very sensitive
applications, especially where security, defense and safety are
involved.


² A detailed discussion about this “expected” property can be found in [20].


III. A SIMPLE EXAMPLE AND ITS STRANGE RESULT


To see the problem in combining evidence with the DS rule,
let us analyze an emblematic example. Consider a frame
of discernment with three elements only, $\Theta = \{A, B, C\}$,
satisfying Shafer’s request, i.e. the elements of the frame are
truly exhaustive and exclusive. As in Zadeh’s example, we
interpret the problem as a medical diagnosis, where $A$, $B$ and $C$
correspond to three distinct pathologies (say $A$ = brain tumor,
$B$ = concussion and $C$ = meningitis) of a patient. In such a
situation, it is reasonable to assume that these pathologies do
not occur simultaneously, so Shafer’s assumptions truly hold.

We suppose that two distinct doctors (or more generally,
two witnesses) provide their own medical diagnosis (or more
generally, a testimony) of the same patient, based on their own
knowledge and expertise, after analyzing symptoms, MRI
images, or any useful medical results. The diagnoses (testimonies)
of the two distinct sources of evidence correspond
to the two non-Bayesian bba’s given by the doctors, listed in
Table I. The parameters $a$, $b_1$, and $b_2$ can take any value, as
long as $a \in [0, 1]$, $b_1, b_2 > 0$, and $b_1 + b_2 \in [0, 1]$.

















Focal elem. \ bba’s | $m_1(.)$ | $m_2(.)$
$A$ | $a$ | $0$
$A \cup B$ | $1 - a$ | $b_1$
$C$ | $0$ | $1 - b_1 - b_2$
$A \cup B \cup C$ | $0$ | $b_2$

Table I
INPUT BBA’S $m_1(.)$ AND $m_2(.)$.


The two distinct sources are assumed to be truly independent
(the diagnosis of Doctor 1 is made independently of
the diagnosis of Doctor 2 and from different medical results,
supporting images, etc., and conversely) so that we are allowed to
apply the DS rule to combine the two bba’s $m_1(.)$ and $m_2(.)$. Both
doctors are also assumed to have the same level of expertise
and to be equally reliable. Note that in this very simple
parametric example the focal elements of the bba’s are not nested
(consonant), and there really does exist a conflict between
the two sources (as will be shown in the derivations). It
is also worth noting that the two distinct sources are truly
informative since neither of them corresponds to the vacuous
belief assignment representing a fully ignorant source, so it is
reasonable to expect both bba’s to be taken into account
(and to have an impact) in the fusion process. Here we use
the notion of “conflict” as defined by Shafer in [1] (p. 65) and
recalled by (3).

When applying the DS rule of combination, one gets:

1) Using the conjunctive operator:

$m_{12}(A) = a(b_1 + b_2)$   (6)
$m_{12}(A \cup B) = (1 - a)(b_1 + b_2)$   (7)
$K_{12} = m_{12}(\emptyset) = 1 - b_1 - b_2$ (conflicting mass)   (8)

2) After normalizing by $1 - K_{12} = b_1 + b_2$, the final
result is as follows:

$m_{DS}(A) = \frac{m_{12}(A)}{1 - K_{12}} = \frac{a(b_1 + b_2)}{b_1 + b_2} = a = m_1(A)$   (9)

$m_{DS}(A \cup B) = \frac{m_{12}(A \cup B)}{1 - K_{12}} = \frac{(1 - a)(b_1 + b_2)}{b_1 + b_2} = 1 - a = m_1(A \cup B)$   (10)

Surprisingly, after combining the two sources of evidence
with Dempster-Shafer’s rule, we see that in this case the
medical diagnosis of Doctor 2 doesn’t count at all, because
one gets $m_{DS}(.) = m_1(.)$. Although Doctor 2 is not a fully
ignorant source and has the same reliability as Doctor
1, his/her report (whatever it is when changing the
values of $b_1$ and $b_2$) doesn’t count. We see that the level of
conflict $K_{12} = 1 - b_1 - b_2$ between the two medical diagnoses
doesn’t in fact matter in the DS fusion process, since it can be
chosen at any high or low level, depending on the choice of
$b_1 + b_2$. Based on the DST analysis, Doctor 2 plays the same
role as a vacuous/ignorant source of evidence even if he/she
is informative (not vacuous), and truly conflicting (according
to Shafer’s definition) with Doctor 1.
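The result (9)–(10) can be checked mechanically for several settings of $b_1$ and $b_2$. The sketch below uses exact rational arithmetic so that equality with $m_1(.)$ can be tested without rounding; the helper names `ds_combine` and `doctors` and the chosen parameter values are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def ds_combine(m1, m2):
    """Normalized conjunctive (DS) combination; returns (fused bba, K12)."""
    conj = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        conj[x1 & x2] = conj.get(x1 & x2, 0) + w1 * w2
    k12 = conj.pop(frozenset(), 0)   # mass of the empty set, eq. (8)
    return {x: w / (1 - k12) for x, w in conj.items()}, k12

A, AB, C, ABC = map(frozenset, ("A", "AB", "C", "ABC"))

def doctors(a, b1, b2):
    """Input bba's of Table I for given parameters a, b1, b2."""
    m1 = {A: a, AB: 1 - a}
    m2 = {AB: b1, C: 1 - b1 - b2, ABC: b2}
    return m1, m2

# Low, zero and very high conflict between the two doctors.
for b1, b2 in [(F(1, 2), F(1, 4)), (F(1, 2), F(1, 2)), (F(1, 100), F(1, 1000))]:
    m1, m2 = doctors(F(1, 4), b1, b2)
    fused, k12 = ds_combine(m1, m2)
    print("K12 =", k12, "| fused == m1:", fused == m1)
```

Each line reports `True` for the comparison, whatever the conflict level, which is the behavior described in the paragraph above.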

This result goes against common sense. It casts serious
doubt on the validity of the DS rule, as well as on its usefulness
for applications, and raises questions about the real meaning of
Shafer’s pooling of evidence process. This example seems
more crucial than the examples discussed in the existing
literature in showing intolerable flaws in DST behavior, since
in this example the level of conflict (whatever it is) between the
sources doesn’t play a role at all, so that it cannot be argued
that in such a case DS must not be applied because of the
highly conflicting situation. In fact, such a situation can occur in
real applications and is not anecdotal, and the results obtained
by the DS rule can have dramatic consequences. From Zadeh’s
example [6] and all the debates about it in the literature, it
has been widely (though not completely) admitted that DS is
not recommended when the conflict between sources is high.
Our example brings out a more important question since it
reveals that the problem of the behavior of the DS rule is not due
to the (high) level of conflict between the sources, but to
something else: we can choose a low conflict level, but the
result is still the same, so the problem remains.

We can see the situation better by generalizing from this 
example. What makes this example special and emblematic of 
DS behavior is the fact that $Pl_1(C) = 0$. It not only means that 
Doctor 1 completely rules out the possibility of $C$, but also that 
this opinion cannot be changed by taking new evidence into 
consideration. This is the case because, according to Shafer’s 
definition [1] (p. 43), $Pl_1(C) = 0$ means that $m_1(X) = 0$ for 
every $X \in 2^\Theta$ such that $X \cap C \neq \emptyset$. When DS rule is applied to 
combine $m_1(.)$ and an arbitrary $m_2(.)$, we get $m_{DS}(Y) = 0$ for 
every $Y \in 2^\Theta$ such that $Y \cap C \neq \emptyset$, because $m_{DS}(Y)$ is the sum of 
some products, each of which takes one of the above $m_1(X)$ 
as a factor. Consequently, $Pl_{DS}(C) = 0$, no matter what the 
other body of evidence is. Actually in such situations DS rule 
doesn’t perform a fusion between sources’ opinions, but an 
exclusion, ruling out the conflicting hypothesis considered by 
the second source. 
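
The exclusion effect described above can be checked numerically. The following Python sketch is only an illustration: the dictionary-of-frozensets representation, the function names `dempster`, `plausibility` and `doctors_example`, and the particular values of a, b1 and b2 are assumptions made for this example and do not come from the original text.

```python
from itertools import product


def dempster(m1, m2):
    """Dempster's rule for two bba's given as {frozenset: mass} dictionaries."""
    joint = {}
    conflict = 0.0
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        z = x & y
        if z:
            joint[z] = joint.get(z, 0.0) + mx * my
        else:
            conflict += mx * my  # mass sent to the empty set
    if not joint:
        raise ValueError("total conflict: the rule is not applicable")
    norm = 1.0 - conflict
    return {z: v / norm for z, v in joint.items()}


def plausibility(m, hypothesis):
    """Pl(hypothesis): total mass of focal elements intersecting it."""
    return sum(v for x, v in m.items() if x & hypothesis)


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")


def doctors_example(a, b1, b2):
    """Fuse Doctor 1 (focal elements A and A u B) with Doctor 2."""
    m1 = {A: a, A | B: 1.0 - a}
    m2 = {A | B: b1, C: 1.0 - b1 - b2, A | B | C: b2}
    m2 = {s: v for s, v in m2.items() if v > 0}  # keep focal elements only
    return dempster(m1, m2)


if __name__ == "__main__":
    for b1, b2 in [(0.1, 0.1), (0.0, 0.01), (0.5, 0.4)]:
        fused = doctors_example(0.9, b1, b2)
        print(b1, b2, fused, plausibility(fused, C))
```

Whatever admissible values are chosen for b1 and b2, the fused bba keeps only the focal elements of Doctor 1, and the plausibility of C stays at zero.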

Put another way, the effective frame of discernment 
of Doctor 1 is not really $\{A,B,C\}$, but $\{A,B\}$, because the 
pathology $C$ has been ruled out of the frame by Doctor 1, 
since the focal elements of $m_1(.)$ are $A$ and $A \cup B$ only. 
The above analysis tells us that when different supports (i.e. 
sets of focal elements) are combined according to DS rule, 
the resulting bba will be defined in the intersection of the 
supports of each source, under the condition that it is not 
empty (otherwise the evidence is totally conflicting, and the rule 
is not applicable). Furthermore, all of the original bba’s will be 
normalized on this common support before being combined. 
This is the very fundamental principle on which DST and the 
combination of evidence proposed by Shafer are based. 

More precisely in our example, the adjusted bba $m'_2(.)$ of 
Doctor 2 is described in Tables II, III and IV. 

Table II 
STEP 1 OF ADJUSTMENT OF $m_2(.)$. 

Focal elem. \ bba’s               | $m_1(.)$ | $m_2(.)$ 
$A$                               | $a$      | $0$ 
$A \cup B$                        | $1-a$    | $b_1$ 
$C \equiv \emptyset$              | $0$      | $1-b_1-b_2$ 
$A \cup B \cup C \equiv A \cup B$ | $0$      | $b_2$ 

Table III 
STEP 2 OF ADJUSTMENT OF $m_2(.)$. 

Focal elem. \ bba’s  | $m_1(.)$ | $m_2(.)$ 
$A$                  | $a$      | $0$ 
$A \cup B$           | $1-a$    | $b_1+b_2$ 
$C \equiv \emptyset$ | $0$      | $1-b_1-b_2$ 

Table IV 
ADJUSTED AND NORMALIZED BBA’S $m_1(.)$ AND $m'_2(.)$. 

Focal elem. \ bba’s | $m_1(.)$ | $m'_2(.)$ 
$A$                 | $a$      | $0$ 
$A \cup B$          | $1-a$    | $\frac{b_1+b_2}{b_1+b_2} = 1$ 

After this adjustment, the bba $m'_2(.)$ of Doctor 2 becomes 
the vacuous bba, which has no impact on the result. This 
perfectly explains the result produced by DS rule, but doesn’t 
suffice to fully justify its real usefulness for applications. 
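
The same conclusion can be reached by writing out the combination directly. The short derivation below is a sketch that assumes the bba's of the example are $m_1(A) = a$, $m_1(A \cup B) = 1-a$, $m_2(A \cup B) = b_1$, $m_2(C) = 1-b_1-b_2$ and $m_2(A \cup B \cup C) = b_2$ with $b_1 + b_2 > 0$, as read from the tables of the adjustment; the symbols $m_{12}$ (unnormalized conjunctive masses) and $K_{12}$ (conflicting mass) are notations introduced here for convenience.

```latex
\begin{align*}
m_{12}(A)        &= m_1(A)\,m_2(A\cup B) + m_1(A)\,m_2(A\cup B\cup C) = a\,(b_1+b_2),\\
m_{12}(A\cup B)  &= m_1(A\cup B)\,m_2(A\cup B) + m_1(A\cup B)\,m_2(A\cup B\cup C) = (1-a)(b_1+b_2),\\
K_{12}           &= m_1(A)\,m_2(C) + m_1(A\cup B)\,m_2(C) = 1-b_1-b_2,\\
m_{DS}(A)        &= \frac{m_{12}(A)}{1-K_{12}} = \frac{a\,(b_1+b_2)}{b_1+b_2} = a,\\
m_{DS}(A\cup B)  &= \frac{m_{12}(A\cup B)}{1-K_{12}} = \frac{(1-a)(b_1+b_2)}{b_1+b_2} = 1-a .
\end{align*}
```

The parameters $b_1$ and $b_2$ cancel out, so the fused bba coincides with $m_1(.)$ whatever the level of conflict $K_{12}$ is.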

In general, given two frames of discernment to be combined, 
if one is a proper subset of the other, the result is asymmetric 
— the smaller frame always wins the competition, though the 
other one does not always become vacuous. 

Again, here we see that the result is not from any specialty 
of our emblematic example, but directly from the conjunctive 
nature of the DS rule. As Shafer wrote: “A basic idea of the 
theory of belief functions is the idea of evidence whose only 
direct effect on the frame $\Theta$ is to support a subset $A_1$, and 
an implicit aspect of this idea is that when this evidence is 
combined with further evidence whose only direct effect on 
$\Theta$ is to establish a compatible subset $A_2$, the support for $A_1$ 
is inherited by $A_1 \cap A_2$.” [21] 


Now the fundamental question becomes: should evidence 
combination be treated in this way? 


IV. EVALUATING THE VALIDITY OF DST 


After sharing the above result we found with other re- 
searchers in the field, we got three types of response, which 
can be roughly categorized as: 


1) This result does not show that DST is wrong, but that 
there are situations where it is not applicable. This 
example contains conflicting evidence, so DST should 
not be applied. 

2) This result does not show that DST is wrong, and this 
result is exactly the correct one. It is your intuition that 
is wrong. 

3) This result shows that DST is wrong, since it is 
unreasonable to let one expert’s opinion completely 
suppress the other opinions. 


The first response is not very satisfactory because it tells 
us that DST should not be applied when evidences conflict. If 
we admit such a response, what is the real purpose in using 
DS rule in practical applications using belief functions, since 
most of them do involve conflicting sources? In agreeing with 
the first response, we see that DS rule reduces to the strict 
conjunctive rule which should be used only in limited cases 
where there is no conflict between sources. It is not obvious 
why the conjunctive rule, even in these cases, is well 
adapted for the pooling of evidence. In fact, in the context 
of non-conflicting sources, the conjunctive rule corresponds 
just to the selection of the most specific source, rather than a 
combination (pooling) of evidences. 

Each of the two last responses is supported by a long 
argument, which sounds reasonable until they are put together 
— how can we have such different opinions on such a simple 
example? Can DST be used to combine them to provide a 
final conclusion based on the pooled evidence? 

Instead of trying to apply DS rule (if possible) or to analyze 
the above responses one by one, we will temporarily step back 
from this concrete case, and discuss a meta-level problem first, 
that is, when a mathematical theory is applied to a practical 
situation, how do we decide the validity of this application? In 
what sense is the result “right” or “wrong”? 

Of course, there are some trivial cases where the solution 
is obvious. If the result is deterministic and there is an 
objective way to check it, then the conclusion is conclusive. 
Unfortunately, in the field of uncertain reasoning, it is not 
that simple. In the above example, we cannot use the disease 
the patient has (assume we finally become certain about it) 
to decide whether DST is correctly applied to it, though it 
may influence our degree of belief about the theory. Actually, 
this is exactly how “evidence” is different from “proof” in 
deciding the truthfulness of a conclusion — while a proof can 
determine the truth-value of a statement conclusively, evidence 
can only do so tentatively, because in realistic situations there 
is always further evidence to come. 

Another relatively simple situation is that an internal 
inconsistency is found in the mathematical theory. In that case 
the theory is clearly “wrong”, and is not good for any 
normal usage. This is not the case here, either. There are 
inconsistencies found about DST, such as [11], but they lie 
between the theory and its semantic interpretations (that is, 
between what it is claimed to do and what it actually does), 
rather than within the (uninterpreted) mathematical structure 
of the theory. 

What we are facing is a more complicated situation, where 
the result produced by a theory “sounds wrong”, that is, it 
conflicts with our intuition, experience, or belief. DST is not 
the only theory that has run into this kind of trouble, and 
there are indeed three logical possibilities, as represented by 
the responses listed previously. What makes the situation 
more complicated is the existence of two types of researchers, 
with very different motivations in this context: 

• A: There are people who start with a domain problem, 
which is called “belief revision”, “evidential reasoning”, 
“data fusion”, and so on, by different researchers. They 
are looking for a mathematical tool for this job. 

• B: There are people who start with a mathematical model 
that has some properties they like, DST in this case, and 
are looking for proper practical applications for it. 

In general, both motivations are legitimate, but it is crucial 
that they should not be confused with each other. We belong to 
Type A, and are evaluating DST with respect to the problem 
we have in mind, to which DST is often claimed to be a 
solution. For this reason, we argue that DST failed to do the 
job. Some objections to our conclusion come from people of 
Type B. To them, DST can be called “wrong” only when an 
internal inconsistency is found; otherwise the theory is always 
correct, and all mistakes are caused by its human users. Here 
we are not criticizing DST in that sense. Using the above 
example, we conclude DST to be “wrong” because it fails to 
properly handle evidence combination, or in other words, what 
it claims to do does not match what it actually does, as with the 
defect proved in [11]. 

To support our conclusion with evidence (rather than with 
intuition), we start from an analysis of the task of “evidence 
combination” (or call it “data fusion”). As mentioned above, 
“evidence” has an impact on “degree of belief” in a system 
doing evidential reasoning, like “proof” has on “truth-value” 
in a system using classical logic, except here the impact is 
tentative and inconclusive (i.e. it doesn’t provide an absolute 
truth). This is exactly why evidence combination becomes nec- 
essary (while there is no corresponding operation in classical 
logic) — in a system that is open to new evidence, it needs 
to use new evidence to adjust its degree of belief, and the 
“rule” here should be similar to the rule used to merge the 
opinions of different experts. In both cases, each opinion has 
some evidential support, though none of them can be treated 
as absolutely certain. 

It is according to the above understanding of “evidence 
combination” that DST’s result in the above example is 
considered “wrong”, simply because it allows certain opinions 
to become immune to revision. To be concrete, what if the 
previous example consists of 100 doctors, and all of them, 
except Doctor 1, consider C the most likely disease the patient 
has, though they cannot completely rule out the possibility of 
A and B. On the other hand, Doctor 1, for some unspecified 
reason, considers C impossible, and A more likely than B. In 
this case, DST will still completely accept Doctor 1’s opinion, 
and ignore the judgment of the other 99 experts. We don’t 
believe anyone will consider this judgment reasonable. 
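
The 100-doctor scenario can be simulated in a few lines. The sketch below is illustrative only: the numerical masses given to Doctor 1 and to the 99 other doctors are hypothetical values chosen to match the verbal description, exact rational arithmetic (`fractions.Fraction`) is used to avoid rounding effects, and the helper name `dempster` is an assumption of this example.

```python
from fractions import Fraction
from functools import reduce


def dempster(m1, m2):
    """Dempster's rule for two bba's given as {frozenset: mass} dictionaries."""
    joint = {}
    for x, mx in m1.items():
        for y, my in m2.items():
            z = x & y
            if z:  # products falling on the empty set are discarded
                joint[z] = joint.get(z, 0) + mx * my
    total = sum(joint.values())
    if total == 0:
        raise ValueError("total conflict: the rule is not applicable")
    return {z: v / total for z, v in joint.items()}


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
THETA = A | B | C

# Hypothetical opinions: Doctor 1 rules out C and prefers A to B;
# each of the 99 others strongly favours C without excluding A or B.
doctor_1 = {A: Fraction(3, 5), A | B: Fraction(2, 5)}
other_doctor = {C: Fraction(9, 10), THETA: Fraction(1, 10)}

pool_of_99 = reduce(dempster, [other_doctor] * 99)
fused = reduce(dempster, [doctor_1] + [other_doctor] * 99)

if __name__ == "__main__":
    print("99 doctors alone, mass on C:", float(pool_of_99[C]))
    print("all 100 doctors:", {tuple(sorted(s)): str(v) for s, v in fused.items()})
```

Under these assumed values the 99 concordant doctors put almost all of their pooled mass on C, yet the fused result of the 100 doctors is exactly Doctor 1's bba.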

Based on conjunction, DS rule supports the dictatorial 
power of a source, by accepting the minority opinion as the 
effective solution for “pooling” evidences, even though the 
general a priori assumption in applying DS rule is that all sources 
of information are equally reliable, which means all sources’ 
opinions should be taken into account on equal terms. From 
a theoretical point of view, we don’t think this type of belief 
should be allowed in evidential reasoning; from a practical 
point of view, such a treatment can lead to serious 
consequences, since it means that some errors in one evidence 
channel cannot be corrected by other channels, no matter how 
many and how strong. 

To us, the only possible way to justify DST in similar 
situations is to change what we mean by “evidence combination”. 
According to Shafer’s treatment, “evidence combination” 
becomes a process similar to constraint satisfaction, where each 
piece of evidence puts some absolute restriction on where 
the final result can be, and their combination corresponds to 
“to reach a consensus by mutual constraining”. According 
to this interpretation, Doctor 1 has the right to suppress all 
the other opinions and therefore can dictate his opinion. 
If we want to consider each doctor’s opinion as absolute 
truth (following Shafer’s interpretation), though sometimes 
underspecified, then the result becomes acceptable. But in 
this case, the validity and usefulness of DS rule is strongly 
conditioned by the justification of the fact that each doctor 
does really have access to the absolute truth on the proposition 
under consideration. How can this be done in practice? From 
what knowledge can a doctor get an absolute truth on a 
proposition? The answers to these very important questions 
for validating DS rule haven’t been given in the literature so 
far (to the authors’ knowledge). 

Furthermore, if every doctor is allowed to claim this kind 
of absolute truth, there is nothing preventing different doctors 
from announcing different “truths”, which leads to “total 
conflict” situation that cannot be resolved by Dempster’s rule. 
Therefore, the theory faces a paradox: it must either ban the 
claim of any unrevisable belief, or find a way to handle the 
conflict among such beliefs. To accept unrevisable beliefs only 
from a single source does not sound reasonable. 

The difference between the two interpretations of “evidence 
combination” is semantic and philosophical. According to 
our interpretation, when there are competing opinions 
supported by distinct evidences, none of them has “absolute 
truth”, but each has some “relative truth”, with respect to 
the supporting evidence, so in the combination process all 
the opinions can be more or less revised, and the result is 
usually a compromise; according to Shafer’s interpretation, if 
one source considers an element in the frame of discernment 
as impossible, this judgment will be taken as absolute truth, 
and is therefore unrevisable by the other opinions. 

Though it is possible to imagine certain situations, such 
as Shafer’s “random coding” scenario [21], where DST can 
produce reasonable results, we believe our interpretation of 
“evidence combination” better matches the common sense 
meaning of the phrase, as well as the most practical needs 
in this domain. 

It is true that every mathematical theory has its limited 
applicable domain, and we are not demanding DST to be 
“universal”. However, here the situation is that DST is often 
presented as a general mechanism for evidential reasoning. 
Even though it has been widely acknowledged in the 
community that DST cannot properly handle (highly) conflicting 
evidence, the cause has not been clearly analyzed, nor are 
the applicable situations of the theory clearly specified. The 
above analysis answers these questions: conflicting evidence 
(whatever it is, in high or in low conflict) cannot be 
handled well by DST, since it cannot be seen as “partial 
truth” anymore. 

The last important point to underline concerns the DS 
conditioning rule (4) and the formula (5) for conditional plausibility. 
Let us consider $\Theta$ and two bba’s $m_1(.)$ and $m_2(.)$ defined on 
$2^\Theta$ and their DS combination $m_{DS}(.) = [m_1 \oplus m_2](.)$, and 
let us assume a conditioning element $Z \neq \emptyset$ in $2^\Theta$ and the 
bba $m_Z(Z) = 1$; then $m_{DS}(.|Z) = [m_{DS} \oplus m_Z](.) = 
[m_1 \oplus m_2 \oplus m_Z](.)$. Because $m_{DS}(.) = [m_1 \oplus m_2](.)$ is 
inconsistent with the probability calculus [10], [11], [14], [15], 
[20], $m_{DS}(.|Z)$ is also inconsistent. Therefore for any 
$X$ in $2^\Theta$, the conditional plausibility $Pl(X|Z)$ expressed by 
$Pl(X|Z) = Pl(X \cap Z)/Pl(Z)$ (with apparent similarity with 
Bayes formula) obtained from $m_{DS}(.|Z)$ is not compatible 
with the conditional probability as soon as several sources of 
evidences are involved. 


V. CONCLUSIONS 


In this paper, through a very simple example, we have 
shown and explained what we consider as a very serious flaw 
of DS reasoning, which has generated strong controversies in 
the last three decades. The problem is: given the mathematical 
property of the combination rule, in certain situations the 
judgment expressed by a single information source will be 
effectively treated as absolute truth that will dominate the 
final result, no matter what judgments the other sources have. 
Such a result is in total disagreement with the common-sense 
notion of “evidence combination”, “information fusing”, or 
whatever the process is called, because in such a process, each 
information or evidence source should always be considered 
only as having local or relative truth. In summary, we believe 
DST has been often and widely used in situations where it 
should not, and such applications are wrong. After several 
decades of existence, proponents of DST need to clearly 
identify the situations where its model may be truly applicable 
and what real experimental “pooling” of evidence process DS 
rule corresponds to. This question is not what this paper is 
discussing, but is left for future research and discussions. 

Abstract. Dempster’s rule of combination is commonly used in the field of 
information fusion when dealing with belief functions. However, it generally requires a 
high computational cost. To reduce it, a basic belief assignment (bba) 
approximation is needed. In this paper we present a new bba approximation 
approach called hierarchical proportional redistribution (HPR), which makes it possible to 
approximate a bba at any given level of non-specificity. Two examples are given 
to show how our new HPR works. 


1 Introduction 


Dempster-Shafer Theory (DST), also called Theory of Evidence [10], has been 
widely used in many applications, e.g., information fusion, pattern recognition and 
decision making [11]. Although it is appealing for uncertainty modeling (while 
appearing more controversial for consistent reasoning), its high computational cost 
remains problematic and is often raised against its use [11]. To resolve this 
problem, three major types of approaches have been proposed. 

The first is to propose efficient procedures for performing exact computations 
[1, 8]. The second is composed of Monte-Carlo techniques [9]. The third is to 
approximate a belief function by a simpler one. The papers of Voorbraak [13] and 
Dubois and Prade [5] are seminal works of this type. Other representative works 
include k-l-x [3] and k-additive belief function [2, 6]. Denœux uses hierarchical 
clustering to implement the inner and outer approximation [3]. 
In this paper, we propose a new method called hierarchical proportional 
redistribution (HPR) to approximate any general basic belief assignment (bba) at a given 
level of non-specificity [4], up to the ultimate level 1 corresponding to a Bayesian 
bba [10]. The level of non-specificity can be controlled by the users through the 
adjustment of the maximum cardinality of remaining focal elements. For the 
approximated bba obtained by HPR, the maximum cardinality of the focal elements 
is k. Thus HPR can be considered as a generalized k-additive belief approximation. 
Some examples are given to show how our proposed HPR method works, and to 
compare it with other approximations. 


2 Basics of Dempster-Shafer Theory (DST) 


In DST [10], the frame of discernment (FoD) is a set $\Theta$ of mutually exhaustive and 
exclusive elements. $m(.) : 2^\Theta \to [0,1]$ is a basic belief assignment (bba), also called 
mass function, if it satisfies 

$$\sum_{A \subseteq \Theta} m(A) = 1, \quad m(\emptyset) = 0. \qquad (1)$$

Belief function ($Bel$) and plausibility function ($Pl$) are defined as 

$$Bel(A) = \sum_{B \subseteq A} m(B) \quad \text{and} \quad Pl(A) = \sum_{A \cap B \neq \emptyset} m(B). \qquad (2)$$

Suppose that $m_1, m_2, \ldots, m_n$ are $n$ bba’s, then Dempster’s rule of combination is 
defined by 

$$m(A) = \begin{cases} 0, & A = \emptyset \\ \dfrac{\sum_{\cap A_i = A} \prod_{1 \le i \le n} m_i(A_i)}{\sum_{\cap A_i \neq \emptyset} \prod_{1 \le i \le n} m_i(A_i)}, & A \neq \emptyset \end{cases} \qquad (3)$$

This rule is used in DST to combine pieces of evidence expressed by bba’s. As 
mentioned above, Dempster’s combination has a high computational cost and three types 
of approaches have been proposed to reduce it. We prefer belief approximation 
approaches [2, 3, 6, 12] since they both reduce the computational cost of the 
combination and make it possible to deal with smaller-size focal elements, which makes it more intuitive for 
humans to grasp the meaning and interpret fusion results [2]. 
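
As a concrete reading of the definitions of this section, the following Python sketch implements the bba conditions, the belief and plausibility functions, and Dempster's rule for n bba's. It is an illustration under stated assumptions: bba's are stored as dictionaries mapping `frozenset` focal elements to masses, and the function names `is_bba`, `bel`, `pl` and `dempster` as well as the two small bba's are hypothetical choices made for this example.

```python
from functools import reduce
from itertools import product
from math import isclose, prod


def is_bba(m):
    """Check that masses are non-negative, sum to one, and m(empty set) = 0."""
    return (all(v >= 0 for v in m.values())
            and isclose(sum(m.values()), 1.0)
            and m.get(frozenset(), 0.0) == 0.0)


def bel(m, a):
    """Bel(A): sum of the masses of the focal elements included in A."""
    return sum(v for b, v in m.items() if b and b <= a)


def pl(m, a):
    """Pl(A): sum of the masses of the focal elements intersecting A."""
    return sum(v for b, v in m.items() if b & a)


def dempster(*bbas):
    """Dempster's rule for n bba's: normalized sum of products over all
    tuples of focal elements whose intersection is non-empty."""
    numerators = {}
    for combo in product(*(m.items() for m in bbas)):
        inter = reduce(frozenset.intersection, (s for s, _ in combo))
        if inter:
            weight = prod(v for _, v in combo)
            numerators[inter] = numerators.get(inter, 0.0) + weight
    total = sum(numerators.values())
    if total == 0:
        raise ValueError("total conflict: the rule is undefined")
    return {s: v / total for s, v in numerators.items()}


T1, T2, T3 = frozenset({"t1"}), frozenset({"t2"}), frozenset({"t3"})
THETA = T1 | T2 | T3
m1 = {T1: 0.6, THETA: 0.4}
m2 = {T2: 0.5, THETA: 0.5}
m12 = dempster(m1, m2)

if __name__ == "__main__":
    print(is_bba(m12), m12)
    print("Bel(t1 u t2) =", bel(m12, T1 | T2), " Pl(t1) =", pl(m12, T1))
```

The loop over the Cartesian product of all focal elements makes the cost of the exact rule visible: it grows with the product of the numbers of focal elements of the sources, which is the motivation for the approximations discussed next.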


3 Two bba Approximation Approaches 


1) k-l-x approximation: This was proposed by Tessem [12]. The simplified bba 
obtained by the k-l-x approach satisfies: a) keep no less than k focal elements; b) keep 
no more than l focal elements; c) the mass assignment to be deleted is no greater 
than x. In k-l-x, the focal elements of an original bba are sorted by their masses. 
The algorithm chooses the first p focal elements such that $k \le p \le l$ and such 
that the sum of the masses of these first p focal elements is no less than $1 - x$. The 
deleted masses are redistributed to the other focal elements through a normalization. 
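
The selection and normalization steps just described can be sketched as follows. This is a minimal illustration, not Tessem's reference implementation: the function name `klx` is hypothetical, and the sketch assumes that p is taken as the smallest admissible value (at least k elements, stopping as soon as the kept mass reaches 1 - x, and never more than l elements). The bba used below is the one of Example 1 in Section 5.

```python
def klx(m, k, l, x):
    """k-l-x approximation of a bba given as a {frozenset: mass} dictionary."""
    ranked = sorted(m.items(), key=lambda item: item[1], reverse=True)
    kept, kept_mass = [], 0.0
    for focal, mass in ranked:
        if len(kept) >= l:
            break  # never keep more than l focal elements
        if len(kept) >= k and kept_mass >= 1.0 - x:
            break  # enough elements and enough mass are already kept
        kept.append((focal, mass))
        kept_mass += mass
    # Deleted mass is redistributed by normalization.
    return {focal: mass / kept_mass for focal, mass in kept}


T1, T2, T3 = frozenset({"t1"}), frozenset({"t2"}), frozenset({"t3"})
example_1 = {
    T1: 0.10, T2: 0.17, T3: 0.03,
    T1 | T2: 0.15, T1 | T3: 0.20, T2 | T3: 0.05,
    T1 | T2 | T3: 0.30,
}

approx_6 = klx(example_1, k=6, l=6, x=0.4)
approx_3 = klx(example_1, k=3, l=3, x=0.4)

if __name__ == "__main__":
    for name, bba in (("k=l=6", approx_6), ("k=l=3", approx_3)):
        print(name, {tuple(sorted(s)): round(v, 4) for s, v in bba.items()})
```

With k = l = 6 only the smallest mass (on the third singleton) is dropped; with k = l = 3 the three largest masses are kept, including the one on the whole frame, which illustrates that k-l-x does not control the cardinality of the remaining focal elements.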


2) k-additive belief function approximation: Given $m(.) : 2^\Theta \to [0,1]$, one kind of 
k-additive belief function [2, 6] induced by the mass $m(.)$ is defined by 

$$m_k(B) = m(B) + \sum_{A \supset B,\ A \subseteq \Theta,\ |A| > k} \frac{m(A)}{\mathcal{N}(|A|,k)}, \quad \forall\, |B| \le k$$
$$m_k(B) = 0, \quad \forall\, |B| > k \qquad (4)$$

where $B \subseteq \Theta$ and 

$$\mathcal{N}(|A|,k) = \sum_{j=1}^{k} \binom{|A|}{j} = \sum_{j=1}^{k} \frac{|A|!}{j!\,(|A|-j)!} \qquad (5)$$

is the number of non-empty subsets of A of size at most k. For k-additive belief 
approximation, the maximum cardinality of available focal elements is no greater 
than k. Other bba approximation methods can be found in related references. 
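
A small sketch may help to see how the k-additive approximation moves mass. It assumes the reading in which each mass m(A) with |A| > k is shared equally among all non-empty subsets of A of size at most k; the function name `k_additive` and the dictionary representation are hypothetical choices made for this illustration, and the bba is the one of Example 1 in Section 5.

```python
from itertools import combinations
from math import comb, isclose


def k_additive(m, k):
    """k-additive approximation of a bba given as {frozenset: mass}."""
    out = {}
    for a, mass in m.items():
        if len(a) <= k:
            out[a] = out.get(a, 0.0) + mass
            continue
        # Number of non-empty subsets of A with at most k elements.
        n_subsets = sum(comb(len(a), j) for j in range(1, k + 1))
        share = mass / n_subsets
        for j in range(1, k + 1):
            for subset in combinations(sorted(a), j):
                b = frozenset(subset)
                out[b] = out.get(b, 0.0) + share
    return out


T1, T2, T3 = frozenset({"t1"}), frozenset({"t2"}), frozenset({"t3"})
example_1 = {
    T1: 0.10, T2: 0.17, T3: 0.03,
    T1 | T2: 0.15, T1 | T3: 0.20, T2 | T3: 0.05,
    T1 | T2 | T3: 0.30,
}

two_additive = k_additive(example_1, 2)
one_additive = k_additive(example_1, 1)

if __name__ == "__main__":
    print({tuple(sorted(s)): round(v, 4) for s, v in two_additive.items()})
    print({tuple(sorted(s)): round(v, 4) for s, v in one_additive.items()})
```

Because the share is equal for every subset, the redistribution ignores how much mass the subsets already carry; this is the main difference from the proportional redistribution introduced in the next section.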


4 Hierarchical Proportional Redistribution Approximation 


In this paper we propose a new bba approximation approach called hierarchical 
proportional redistribution (HPR), which provides a new way to reduce step by 
step the mass committed to uncertainties. Ultimately an approximate measure of 
subjective probability can be obtained if needed, i.e. a so-called Bayesian bba in 
[10]. Our proposed procedure can be stopped at any step in the process and thus it 
makes it possible to reduce the number of focal elements of a given bba in a simple manner to 
diminish the size of the core [10] of a bba. Thus we can also reduce the complexity (if 
needed) when applying some complex rules of combination. By using HPR, 
we can obtain approximate bba’s at any non-specificity level that we want. 
Let us first introduce two new notations for convenience and conciseness: 


1. Any element of cardinality $1 \le k \le n$ of the power set $2^\Theta$ will be denoted $X(k)$ 
by convention. For example, if $\Theta = \{A,B,C\}$, then $X(2)$ can denote the following 
partial uncertainties $A \cup B$, $A \cup C$ or $B \cup C$, and $X(3)$ denotes the total uncertainty 
$A \cup B \cup C$. 

2. The proportional redistribution factor (ratio) of width $s$ involving elements $X$ and 
$Y$ of the power set is defined by (for $X \neq \emptyset$ and $Y \neq \emptyset$) 

$$R_s(Y,X) \triangleq \frac{m(Y) + \varepsilon \cdot |X|}{\sum_{Y \subset X,\ |X|-|Y|=s} \left( m(Y) + \varepsilon \cdot |X| \right)} \qquad (6)$$

where $\varepsilon$ is a small positive number introduced here to deal with particular cases 
where $\sum_{Y \subset X,\ |X|-|Y|=s} m(Y) = 0$. 

By convention, we will denote $R(Y,X) \triangleq R_1(Y,X)$ when we use the proportional 
redistribution factors of width $s = 1$, as we do in this paper for the HPR method. 
The HPR is a step-by-step (recursive) proportional redistribution of the mass $m(X(k))$ 
of a given uncertainty $X(k)$ (partial or total) of cardinality $2 \le k \le n$ to all the least 
specific elements of cardinality $k-1$, i.e., to all possible $X(k-1)$, until $k = 2$ is 
reached. The proportional redistribution is done from the masses of belief 
committed to $X(k-1)$, as done classically in the DSmP transformation. The “hierarchical” 
masses $m_h(.)$ are recursively (backward) computed as follows. Here $m_{h(k)}$ represents 
the approximate bba obtained at the step $n-k$ of HPR, i.e., it has the maximum focal 
element cardinality of $k$. 

$$m_{h(n-1)}(X(n-1)) = m(X(n-1)) + \sum_{\substack{X(n) \supset X(n-1) \\ X(n),\, X(n-1) \in 2^\Theta}} \left[ m(X(n)) \cdot R(X(n-1), X(n)) \right];$$
$$m_{h(n-1)}(A) = m(A), \quad \forall\, |A| < n-1 \qquad (7)$$

$m_{h(n-1)}(.)$ is the bba obtained at the first step of HPR ($n-(n-1) = 1$); the maximum 
focal element cardinality of $m_{h(n-1)}$ is $n-1$. 

$$m_{h(n-2)}(X(n-2)) = m(X(n-2)) + \sum_{\substack{X(n-1) \supset X(n-2) \\ X(n-2),\, X(n-1) \in 2^\Theta}} \left[ m_{h(n-1)}(X(n-1)) \cdot R(X(n-2), X(n-1)) \right];$$
$$m_{h(n-2)}(A) = m_{h(n-1)}(A), \quad \forall\, |A| < n-2 \qquad (8)$$

$m_{h(n-2)}(.)$ is the bba obtained at the second step of HPR ($n-(n-2) = 2$); the 
maximum focal element cardinality of $m_{h(n-2)}$ is $n-2$. 

This hierarchical proportional redistribution process can be applied similarly (if 
one wants) to compute $m_{h(n-3)}(.)$, $m_{h(n-4)}(.)$, ..., $m_{h(2)}(.)$, $m_{h(1)}(.)$ with 

$$m_{h(2)}(X(2)) = m(X(2)) + \sum_{\substack{X(3) \supset X(2) \\ X(3),\, X(2) \in 2^\Theta}} \left[ m_{h(3)}(X(3)) \cdot R(X(2), X(3)) \right];$$
$$m_{h(2)}(A) = m_{h(3)}(A), \quad \forall\, |A| < 2 \qquad (9)$$

$m_{h(2)}(.)$ is the bba obtained at step $n-2$ of HPR; the maximum focal 
element cardinality of $m_{h(2)}$ is 2. 

Mathematically, for any $X(1) \in \Theta$, i.e. any $\theta_i \in \Theta$, a Bayesian belief function can 
be obtained by the HPR method by carrying out all possible steps of proportional 
redistribution of partial ignorances in order to get 

$$m_{h(1)}(X(1)) = m(X(1)) + \sum_{\substack{X(2) \supset X(1) \\ X(1),\, X(2) \in 2^\Theta}} \left[ m_{h(2)}(X(2)) \cdot R(X(1), X(2)) \right] \qquad (10)$$

In fact, $m_{h(1)}(.)$ is a probability transformation, called here the Hierarchical DSmP 
(HDSmP). Since $X(n)$ is unique and corresponds only to the full ignorance $\theta_1 \cup \theta_2 \cup 
\ldots \cup \theta_n$, the expression of $m_{h(n-1)}(X(n-1))$ in Eq. (7) just simplifies as 

$$m_{h(n-1)}(X(n-1)) = m(X(n-1)) + m(X(n)) \cdot R(X(n-1), X(n)) \qquad (11)$$

For the full proportional redistribution of the masses of uncertainties to the least 
specific elements involved in these uncertainties, no mass is lost during the step-by-step 
hierarchical process and thus at any step of HPR, the sum of masses is kept to one. 
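
The recursion of this section can be sketched in a few lines of Python. The code below is an illustration under stated assumptions, not the authors' implementation: the function name `hpr`, its parameters `target_card` and `eps`, and the dictionary-of-frozensets representation are hypothetical, and the redistribution ratios of width 1 are computed from the original masses of the subsets of cardinality one less, with the term eps·|X| added to each of them. The two bba's are those used in the examples of Section 5.

```python
def hpr(m, n, target_card=1, eps=0.0):
    """Hierarchical proportional redistribution of a bba {frozenset: mass}
    defined on a frame with n elements, stopped once the largest focal
    element has cardinality target_card."""
    current = dict(m)
    for card in range(n, target_card, -1):
        # Masses of focal elements smaller than `card` are carried over.
        nxt = {s: v for s, v in current.items() if len(s) < card}
        for x, mass in current.items():
            if len(x) != card or mass == 0:
                continue
            subsets = [x - {element} for element in x]  # cardinality card - 1
            weights = [m.get(y, 0.0) + eps * len(x) for y in subsets]
            total = sum(weights)
            if total == 0:
                raise ZeroDivisionError("all ratios undefined: use eps > 0")
            for y, w in zip(subsets, weights):
                nxt[y] = nxt.get(y, 0.0) + mass * w / total
        current = nxt
    return current


T1, T2, T3 = frozenset({"t1"}), frozenset({"t2"}), frozenset({"t3"})
THETA = T1 | T2 | T3
example_1 = {
    T1: 0.10, T2: 0.17, T3: 0.03,
    T1 | T2: 0.15, T1 | T3: 0.20, T2 | T3: 0.05,
    THETA: 0.30,
}
example_2 = {T3: 0.70, THETA: 0.30}

step_1 = hpr(example_1, n=3, target_card=2)   # maximum cardinality 2
bayesian = hpr(example_1, n=3, target_card=1)  # maximum cardinality 1
bayesian_2 = hpr(example_2, n=3, target_card=1, eps=0.001)

if __name__ == "__main__":
    for bba in (step_1, bayesian, bayesian_2):
        print({tuple(sorted(s)): round(v, 4) for s, v in bba.items()})
```

Run on the first bba, this sketch gives masses that agree with the worked values of Example 1 to within rounding of the intermediate ratios; on the second bba, calling it with `eps=0.0` raises an error, which mirrors the need for a positive ε when all candidate subsets carry zero mass.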
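5 Examples 
5.1 Example 1 


Let’s consider the following bba over $\Theta = \{\theta_1, \theta_2, \theta_3\}$: 

$m(\theta_1) = 0.10$, $m(\theta_2) = 0.17$, $m(\theta_3) = 0.03$, $m(\theta_1 \cup \theta_2) = 0.15$, 
$m(\theta_1 \cup \theta_3) = 0.20$, $m(\theta_2 \cup \theta_3) = 0.05$, $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$. 

We apply the HPR with $\varepsilon = 0$ in this example because there is no mass of belief 
equal to zero. It can be verified that the result obtained with a small positive $\varepsilon$ 
parameter remains (as expected) numerically very close to what is obtained with $\varepsilon = 0$. 

• Step 1: The first step of HPR consists in redistributing back $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$ 
committed to the full ignorance to the elements $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ only, 
because these elements are the only elements of cardinality 2 that are included in 
$\theta_1 \cup \theta_2 \cup \theta_3$. Applying Eq. (11) with $n = 3$, one gets for $X(2) = \theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ 
and $\theta_2 \cup \theta_3$ the following masses. 

$$m_{h(2)}(\theta_1 \cup \theta_2) = m(\theta_1 \cup \theta_2) + m(X(3)) \cdot R(\theta_1 \cup \theta_2, X(3)) = 0.15 + (0.30 \cdot 0.375) = 0.2625$$

Similarly, one gets 

$$m_{h(2)}(\theta_1 \cup \theta_3) = m(\theta_1 \cup \theta_3) + m(X(3)) \cdot R(\theta_1 \cup \theta_3, X(3)) = 0.20 + (0.30 \cdot 0.5) = 0.35$$

because $R(\theta_1 \cup \theta_3, X(3)) = \frac{0.20}{0.15+0.20+0.05} = 0.5$, and also 

$$m_{h(2)}(\theta_2 \cup \theta_3) = m(\theta_2 \cup \theta_3) + m(X(3)) \cdot R(\theta_2 \cup \theta_3, X(3)) = 0.05 + (0.30 \cdot 0.125) = 0.0875$$

• Step 2: Now, we go to the next step of the HPR principle, where one needs to redistribute 
the masses of partial ignorances $X(2)$ corresponding to $\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$ 
back to the singleton elements $X(1)$ corresponding to $\theta_1$, $\theta_2$ and $\theta_3$. We use Eq. (10) 
for doing this as follows: 

$$m_{h(1)}(\theta_1) = m(\theta_1) + m_{h(2)}(\theta_1 \cup \theta_2) \cdot R(\theta_1, \theta_1 \cup \theta_2) + m_{h(2)}(\theta_1 \cup \theta_3) \cdot R(\theta_1, \theta_1 \cup \theta_3)$$
$$\approx 0.10 + (0.2625 \cdot 0.3703) + (0.35 \cdot 0.7692) = 0.10 + 0.0972 + 0.2692 = 0.4664$$

because $R(\theta_1, \theta_1 \cup \theta_2) = \frac{0.10}{0.10+0.17} \approx 0.3703$ and $R(\theta_1, \theta_1 \cup \theta_3) = \frac{0.10}{0.10+0.03} \approx 0.7692$. 
Similarly, one gets 

$$m_{h(1)}(\theta_2) = m(\theta_2) + m_{h(2)}(\theta_1 \cup \theta_2) \cdot R(\theta_2, \theta_1 \cup \theta_2) + m_{h(2)}(\theta_2 \cup \theta_3) \cdot R(\theta_2, \theta_2 \cup \theta_3)$$
$$\approx 0.17 + (0.2625 \cdot 0.6297) + (0.0875 \cdot 0.85) = 0.17 + 0.1653 + 0.0744 = 0.4097$$

because $R(\theta_2, \theta_1 \cup \theta_2) = \frac{0.17}{0.10+0.17} \approx 0.6297$ and $R(\theta_2, \theta_2 \cup \theta_3) = \frac{0.17}{0.17+0.03} = 0.85$, and 
also 

$$m_{h(1)}(\theta_3) = m(\theta_3) + m_{h(2)}(\theta_1 \cup \theta_3) \cdot R(\theta_3, \theta_1 \cup \theta_3) + m_{h(2)}(\theta_2 \cup \theta_3) \cdot R(\theta_3, \theta_2 \cup \theta_3)$$
$$\approx 0.03 + (0.35 \cdot 0.2307) + (0.0875 \cdot 0.15) = 0.03 + 0.0808 + 0.0131 = 0.1239$$

because $R(\theta_3, \theta_1 \cup \theta_3) = \frac{0.03}{0.10+0.03} \approx 0.2307$ and $R(\theta_3, \theta_2 \cup \theta_3) = \frac{0.03}{0.17+0.03} = 0.15$. 
Hence, the result of the final step of HPR is: 

$$m_{h(1)}(\theta_1) = 0.4664, \quad m_{h(1)}(\theta_2) = 0.4097, \quad m_{h(1)}(\theta_3) = 0.1239.$$

We can easily verify that $m_{h(1)}(\theta_1) + m_{h(1)}(\theta_2) + m_{h(1)}(\theta_3) = 1$. 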

To compare HPR with the k-l-x approach, we set the parameters of k-l-x 
to obtain bba’s with the same number of focal elements as HPR at each step. In Example 
1, HPR at the first step yields a bba with 6 focal elements. Thus we set 
k = l = 6, x = 0.4 for k-l-x to obtain a bba with 6 focal elements. Similarly, HPR 
at the second step yields a bba with 3 focal elements. Thus we set k = l = 3, x = 0.4 
for k-l-x. The results based on HPR and k-l-x are shown in Table 1. 


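Table 1 Experimental results of Example 1. 

Focal element                        | $m(.)$ | HPR step 1, $m_{h(2)}(.)$ | HPR step 2, $m_{h(1)}(.)$ | k-l-x (k = l = 6) | k-l-x (k = l = 3) 
$\theta_1$                           | 0.1000 | 0.1000 | 0.4664 | 0.1031 | 0.0000 
$\theta_2$                           | 0.1700 | 0.1700 | 0.4097 | 0.1753 | 0.2537 
$\theta_3$                           | 0.0300 | 0.0300 | 0.1239 | 0.0000 | 0.0000 
$\theta_1 \cup \theta_2$             | 0.1500 | 0.2625 | 0.0000 | 0.1546 | 0.0000 
$\theta_1 \cup \theta_3$             | 0.2000 | 0.3500 | 0.0000 | 0.2062 | 0.2985 
$\theta_2 \cup \theta_3$             | 0.0500 | 0.0875 | 0.0000 | 0.0515 | 0.0000 
$\theta_1 \cup \theta_2 \cup \theta_3$ | 0.3000 | 0.0000 | 0.0000 | 0.3093 | 0.4478 
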

5.2 Example 2 


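Let’s consider $\Theta = \{\theta_1, \theta_2, \theta_3\}$, and the bba $m(\theta_3) = 0.70$ and $m(\theta_1 \cup \theta_2 \cup \theta_3) = 0.30$. 
Here, the masses of all the focal elements of cardinality 2 are equal to zero. 
For HPR, when $\varepsilon > 0$, $m(\theta_1 \cup \theta_2 \cup \theta_3)$ will be divided equally and redistributed to 
$\theta_1 \cup \theta_2$, $\theta_1 \cup \theta_3$ and $\theta_2 \cup \theta_3$, because the ratios are (taking for example $\varepsilon = 
0.001$) 

$$R(\theta_1 \cup \theta_2, X(3)) = R(\theta_1 \cup \theta_3, X(3)) = R(\theta_2 \cup \theta_3, X(3)) = \frac{0 + 0.001 \cdot 3}{(0 + 0.001 \cdot 3) \cdot 3} = 0.3333$$

In this case, HPR cannot work directly when $\varepsilon = 0$. This shows the necessity for 
the use of $\varepsilon > 0$. The bba’s obtained from HPR with $\varepsilon = 0.001$ and from k-l-x are listed in 
Table 2. 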
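
To make the role of $\varepsilon$ concrete, the second HPR step of this example can be written out. The short computation below is a sketch that assumes the redistribution ratio of width 1 adds $\varepsilon \cdot |X|$ to each candidate mass, both in the numerator and inside the sum of the denominator; $m_{h(2)}$ and $m_{h(1)}$ denote the bba's after the first and second steps.

```latex
\begin{align*}
m_{h(2)}(\theta_1\cup\theta_2) &= m_{h(2)}(\theta_1\cup\theta_3) = m_{h(2)}(\theta_2\cup\theta_3)
   = 0 + 0.30 \cdot 0.3333 = 0.1000,\\
R(\theta_1,\theta_1\cup\theta_2) &= \frac{0 + 0.001\cdot 2}{(0 + 0.001\cdot 2) + (0 + 0.001\cdot 2)} = 0.5,\\
R(\theta_1,\theta_1\cup\theta_3) &= \frac{0 + 0.001\cdot 2}{(0 + 0.001\cdot 2) + (0.70 + 0.001\cdot 2)}
   = \frac{0.002}{0.704} \approx 0.0028,\\
m_{h(1)}(\theta_1) &= 0 + 0.1000\cdot 0.5 + 0.1000\cdot 0.0028 \approx 0.0503,\\
m_{h(1)}(\theta_3) &= 0.70 + 2\cdot 0.1000\cdot\frac{0.702}{0.704} \approx 0.8994 .
\end{align*}
```

By symmetry $m_{h(1)}(\theta_2) = m_{h(1)}(\theta_1)$, so the three masses sum to one, and the value obtained for $\theta_1$ is consistent with the entry 0.0503 listed in Table 2.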

From the results of Examples 1 & 2, we can see that with k-l-x, the users 
can control the number of focal elements but cannot control the maximum cardinality 
of focal elements. Although the number of focal elements can 
be reduced with k-l-x, focal elements with large cardinality might also be kept. This is not 
good for further reducing computational cost. But with the proposed HPR method, 
users can easily control both the non-specificity of approximated bba’s and the focal 
elements’ size. 
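Table 2 Experimental results of Example 2 ($\varepsilon = 0.001$) 

Focal element                        | $m(.)$ | HPR step 1, $m_{h(2)}(.)$ | HPR step 2, $m_{h(1)}(.)$ | k-l-x (k = l = 6) | k-l-x (k = l = 3) 
$\theta_1$                           | 0.0000 | 0.0000 | 0.0503 | 0.0000 | 0.0000 
$\theta_2$                           | 0.0000 | 0.0000 | 0.0503 | 0.0000 | 0.0000 
$\theta_3$                           | 0.7000 | 0.7000 | 0.8994 | 0.7000 | 0.7000 
$\theta_1 \cup \theta_2$             | 0.0000 | 0.1000 | 0.0000 | 0.0000 | 0.0000 
$\theta_1 \cup \theta_3$             | 0.0000 | 0.1000 | 0.0000 | 0.0000 | 0.0000 
$\theta_2 \cup \theta_3$             | 0.0000 | 0.1000 | 0.0000 | 0.0000 | 0.0000 
$\theta_1 \cup \theta_2 \cup \theta_3$ | 0.3000 | 0.0000 | 0.0000 | 0.3000 | 0.3000 
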
5.3 Example 3 


In this work, an approximation method 1 (giving $m_1(.)$) is considered better than a 
method 2 (giving $m_2(.)$) if both of the following conditions are fulfilled: 1) the distance between 
$m_1(.)$ and the original bba $m(.)$ is smaller than the distance between $m_2(.)$ and the 
original bba $m(.)$, i.e. $d(m_1,m) < d(m_2,m)$; 2) the approximate non-specificity value 
$U(m_1)$ is closer to (and lower than) the true non-specificity value $U(m)$ than $U(m_2)$ is. We 
have used Jousselme’s distance [7], which is commonly used in applications and has 
recently been proved to be a strict distance metric. The non-specificity 
[4] is given by $U(m) = \sum_{A \subseteq \Theta} m(A) \log_2 |A|$. In this example, we make a 
comparison between HPR (method 1) and the k-additive approach (method 2). We have taken 
$\Theta = \{\theta_1, \theta_2, \theta_3, \theta_4, \theta_5\}$ and generated randomly 30 bba’s using the algorithm given in 
[7]. We compute and plot $d(m_1,m)$, $d(m_2,m)$, $U(m)$, $U(m_1)$ and $U(m_2)$ for several 
levels of approximation. The results are shown in Fig. 1 and indicate clearly the 
superiority of HPR over the k-additive approach. 
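
The two evaluation criteria can be computed as in the following Python sketch. It is given for illustration only: the function names `nonspecificity` and `jousselme_distance` and the small bba's are hypothetical, and the distance is written in its usual form, namely the square root of half the quadratic form of the mass difference weighted by the Jaccard index |A ∩ B| / |A ∪ B| of the focal elements.

```python
from math import isclose, log2, sqrt


def nonspecificity(m):
    """U(m): sum over focal elements of m(A) * log2(|A|)."""
    return sum(v * log2(len(a)) for a, v in m.items() if v > 0)


def jousselme_distance(m1, m2):
    """Distance between two bba's given as {frozenset: mass} dictionaries."""
    focal = set(m1) | set(m2)
    diff = {a: m1.get(a, 0.0) - m2.get(a, 0.0) for a in focal}
    quad = sum(diff[a] * diff[b] * len(a & b) / len(a | b)
               for a in focal for b in focal)
    return sqrt(max(0.5 * quad, 0.0))  # guard against tiny negative rounding


T1, T2, T3 = frozenset({"t1"}), frozenset({"t2"}), frozenset({"t3"})
THETA = T1 | T2 | T3

certain_t1 = {T1: 1.0}
certain_t2 = {T2: 1.0}
half_ignorant = {T1: 0.5, THETA: 0.5}

if __name__ == "__main__":
    print("U(half_ignorant) =", nonspecificity(half_ignorant))
    print("d(certain_t1, certain_t2) =", jousselme_distance(certain_t1, certain_t2))
    print("d(certain_t1, half_ignorant) =", jousselme_distance(certain_t1, half_ignorant))
```

A Bayesian bba has zero non-specificity because every focal element is a singleton, and two bba's that commit all their mass to disjoint singletons are at the maximal distance of one.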
[Figure 1 (image not reproduced): panels “bba’s non-specificity” and “Distance of evidence”, horizontal axes from 0 to 30, legend includes “k-additive”.] 

Fig. 1 Results for the Example 3. Comparison of k-additive belief function approximation 
with HPR approximation method. (FS means Focal element Size) 
6 Conclusions 


In this paper, a novel bba approximation called HPR has been proposed as an 
interesting alternative approach to two classical ones. With this HPR, the 
non-specificity degree can be easily controlled by the users. Our examples show its 
behavior and advantage in comparison with other well-known bba approximation 
approaches. HPR has a low computational cost compared with the k-additive approach, 
which will be discussed in a more detailed paper in the future. In further works, we will 
also compare our proposed HPR with more bba approximation approaches 
available in the literature. In this paper, we have used only the distance of evidence 
and the non-specificity criteria, which in fact are not sufficient or comprehensive enough 
to evaluate bba approximations efficiently. So in the future, we will try to propose 
more efficient evaluation criteria to evaluate and design better bba approximations 
(if possible). 

On the Behavior of Dempster’s Rule of Combination 
and the Foundations of Dempster-Shafer Theory 


Albena Tchamova 
Jean Dezert 


Abstract—On the basis of a simple emblematic example we 
analyze and explain the inconsistent and inadequate behavior 
of Dempster-Shafer’s rule of combination as a valid method to 
combine sources of evidences. We identify the cause and the 
effect of the dictatorial power behavior of this rule and of its 
inability to manage the conflicts between the sources. For 
comparison purposes, we present the respective solution obtained 
by the more efficient PCR5 fusion rule proposed originally in 
the Dezert-Smarandache Theory framework. Finally, we identify and 
prove the inherent contradiction of Dempster-Shafer Theory 
foundations. 

Keywords—Belief functions; Dempster-Shafer Theory; DSmT; 
PCR5; contradiction. 


I. INTRODUCTION 


Dempster-Shafer Theory (DST), also known as the Theory 
of Evidence or the Theory of Belief Functions, was introduced 
by Shafer in 1976 [1] based on Dempster’s previous works [2], 
[3], [4]. This theory offers an elegant theoretical framework for 
modeling uncertainty, and provides a method for combining 
distinct bodies of evidence collected from different sources. 
Over more than three decades, DST has been used
in many applications, in fields including information fusion,
pattern recognition and decision making [5].

In spite of this, starting from Zadeh’s criticism [6], [7],
[8], many questions have arisen about the validity and the
consistency of this theory when combining uncertain and
conflicting evidences expressed as basic belief assignments
(bba’s). Besides Zadeh’s example, there have been several
detailed analyses on this topic by Lemmer [9], Voorbraak
[10] and Wang [11]. Other authors like Pearl [12] and Walley
[13], and more recently Gelman [14], have also warned the
“belief function community” about this fundamental problem,
i.e., the validity of Dempster-Shafer’s rule¹ (DS rule for short)
for combining distinct pieces of evidence. Since the mid-1990s,
many researchers and engineers working with belief
functions in applications have observed and admitted that DS
rule is problematic for evidence combination, especially when
the sources of evidence are highly conflicting.


¹This rule is also called Dempster’s rule in the literature because it was
originally proposed by Dempster. We prefer to name it Dempster-Shafer’s rule
because it has widely been promoted by Shafer in his development of the theory
of belief functions (a.k.a. DST).


Originally published as Tchamova A., Dezert J., On the Behavior
of Dempster’s Rule of Combination and the Foundations of
Dempster-Shafer Theory, IEEE IS’2012, Sofia, Bulgaria, Sept. 6-8,
2012 (Best paper awards), and reprinted with permission.

In response to this challenge, various attempts have been
made to circumvent the counter-intuitive behaviors of DS
rule. They either replace Dempster-Shafer’s rule by alternative
rules, listed for example in [15] (Vol. 1), or apply novel
semantic interpretations to the functions [15], [16], [17].
This work is based on preliminary ideas presented at the
Spring School on Belief Functions Theory and Applications
in April 2011 [18], and on many fruitful discussions with
colleagues using belief functions. We start from a very basic
but emblematic example to show what is really questionable
in DS rule. We demonstrate that the main problem in applying
DS rule comes not from the level of conflict between the sources
to combine, but from the underlying interpretation of evidence
and degree of belief on which the combination rule is based.
We make a comparison with the respective results obtained by
using the Proportional Conflict Redistribution rule no. 5 (PCR5)
defined within Dezert-Smarandache Theory (DSmT) [15]. In
Section II we briefly recall the basics of DST and DS rule. The basics
of the PCR5 fusion rule are outlined in Section III. In Section
IV we describe our basic example and discuss the counter-intuitive
result obtained by DS rule and its strange behavior,
corresponding to the dictatorial power of a particular source of
evidence with respect to all other sources. A comparison
with the respective results obtained by the PCR5 fusion rule is also
made. After a discussion on the dictatorial power of DS rule in
Section V, we establish and prove in Section VI a fundamental
theorem on the contradiction grounded in DST foundations.
Concluding remarks are given in Section VII.


II. BASICS OF DST 


Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ be a frame of discernment of a
problem under consideration containing $n$ distinct elements
$\theta_i$, $i = 1, \ldots, n$. A basic belief assignment (bba, also called
a belief mass function) $m(.) : 2^\Theta \to [0,1]$ is a mapping from
the power set of $\Theta$ (i.e. the set of subsets of $\Theta$), denoted $2^\Theta$, to
$[0,1]$, that must satisfy the following conditions: 1) $m(\emptyset) = 0$,
i.e. the mass of the empty set (impossible event) is zero; 2)
$\sum_{X \in 2^\Theta} m(X) = 1$, i.e. the mass of belief is normalized to
one. The quantity $m(X)$ represents the mass of belief exactly
committed to $X$. An element $X \in 2^\Theta$ is called a focal element
if and only if $m(X) > 0$. The set $\mathcal{F}(m) \triangleq \{X \in 2^\Theta \mid m(X) > 0\}$
of all focal elements of a bba $m(.)$ is called the core of the
bba. By definition, a Bayesian bba $m(.)$ is a bba having only
focal elements of cardinality 1. The vacuous bba characterizing
full ignorance is defined by $m_v(.) : 2^\Theta \to [0,1]$ such that
$m_v(X) = 0$ if $X \neq \Theta$, and $m_v(\Theta) = 1$.

From any bba $m(.)$, the belief function $Bel(.)$ and the
plausibility function $Pl(.)$ are defined for $\forall X \in 2^\Theta$ as:
$Bel(X) = \sum_{Y \in 2^\Theta | Y \subseteq X} m(Y)$ and $Pl(X) = \sum_{Y \in 2^\Theta | X \cap Y \neq \emptyset} m(Y)$.
$Bel(X)$ represents the whole mass of belief that comes from
all subsets of $\Theta$ included in $X$. It is interpreted as the
lower bound of the probability of $X$, i.e. $P_{\min}(X)$. $Bel(.)$
is a subadditive measure since $\sum_{\theta_i \in \Theta} Bel(\theta_i) \leq 1$. $Pl(X)$
represents the whole mass of belief that comes from all
subsets of $\Theta$ compatible with $X$ (i.e., those intersecting $X$).
$Pl(X)$ is interpreted as the upper bound of the probability
of $X$, i.e. $P_{\max}(X)$. $Pl(.)$ is a superadditive measure since
$\sum_{\theta_i \in \Theta} Pl(\theta_i) \geq 1$. $Bel(X)$ and $Pl(X)$ are classically seen
as lower and upper bounds of an unknown probability $P(.)$, and
the following inequality is satisfied: $Bel(X) \leq P(X) \leq Pl(X)$, $\forall X \in 2^\Theta$.
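The definitions of $Bel(.)$ and $Pl(.)$ above can be illustrated with a few lines of Python. This is only a minimal sketch under our own assumptions: a bba is stored as a dictionary mapping focal elements (frozensets) to masses, and the helper names `nonempty_subsets`, `bel`, `pl` as well as the numerical masses are hypothetical choices made for illustration, not taken from the paper.

```python
from itertools import combinations

def nonempty_subsets(frame):
    """Return every non-empty subset of the frame as a frozenset."""
    items = sorted(frame)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def bel(m, X):
    """Bel(X): sum of the masses of the focal elements included in X."""
    return sum(mass for Y, mass in m.items() if Y <= X)

def pl(m, X):
    """Pl(X): sum of the masses of the focal elements intersecting X."""
    return sum(mass for Y, mass in m.items() if Y & X)

# Illustrative bba on the frame {A, B, C} (masses chosen arbitrarily).
frame = frozenset("ABC")
m = {frozenset("A"): 0.5, frozenset("AB"): 0.3, frame: 0.2}

for X in nonempty_subsets(frame):
    # Each line shows the subset with its lower and upper probability bounds.
    print(sorted(X), round(bel(m, X), 3), round(pl(m, X), 3))
```

For this particular bba the singleton beliefs sum to 0.5 and the singleton plausibilities sum to 1.7, which agrees with the subadditivity of $Bel(.)$ and the superadditivity of $Pl(.)$ stated above.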

The DS rule of combination [1] is a mathematical operation,
denoted $\oplus$, which corresponds to the normalized conjunctive
fusion rule. Based on Shafer’s model of the frame, the combination
of two independent and distinct sources of evidence
characterized by their bba’s $m_1(.)$ and $m_2(.)$ and related to the
same frame of discernment $\Theta$ is defined by $m_{DS}(\emptyset) = 0$, and
$\forall X \in 2^\Theta \setminus \{\emptyset\}$ by

$$m_{DS}(X) = [m_1 \oplus m_2](X) = \frac{m_{12}(X)}{1 - K_{12}} \qquad (1)$$

where

$$m_{12}(X) \triangleq \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2) \qquad (2)$$

corresponds to the conjunctive consensus on $X$ between the
two sources of evidence. $K_{12}$ is the total degree of conflict
between the two sources of evidence defined by

$$K_{12} \triangleq m_{12}(\emptyset) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = \emptyset}} m_1(X_1) m_2(X_2) \qquad (3)$$


When $K_{12} = m_{12}(\emptyset) = 1$, the two sources are said to
be in total conflict and their combination cannot be applied,
since DS rule (1) is mathematically not defined because of
0/0 indeterminacy [1]. DS rule is commutative and associative,
which makes it very attractive from an engineering implementation
standpoint, since the combinations of sources can be
done sequentially instead of globally and the order doesn’t matter.
Moreover, the vacuous bba is a neutral element for the DS
rule, i.e. $[m \oplus m_v](.) = [m_v \oplus m](.) = m(.)$ for any bba $m(.)$
defined on $2^\Theta$, which seems to be an expected² property, i.e.
a fully ignorant source doesn’t impact the fusion result.
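Equations (1)-(3) and the properties just mentioned can be sketched in Python as follows. The dictionary-of-frozensets representation, the function names `conjunctive` and `ds_rule`, the tolerance `tol` and the two example bba’s are assumptions made for illustration only; the sketch simply follows the formulas and is not presented as a reference implementation.

```python
def conjunctive(m1, m2):
    """Conjunctive consensus m12 of Eq. (2); the key frozenset() holds K12."""
    m12 = {}
    for X1, v1 in m1.items():
        for X2, v2 in m2.items():
            X = X1 & X2
            m12[X] = m12.get(X, 0.0) + v1 * v2
    return m12

def ds_rule(m1, m2, tol=1e-12):
    """Normalized conjunctive rule of Eq. (1); returns (fused bba, K12)."""
    m12 = conjunctive(m1, m2)
    k12 = m12.pop(frozenset(), 0.0)        # Eq. (3): mass on the empty set
    if 1.0 - k12 < tol:
        raise ZeroDivisionError("total conflict (K12 = 1): DS rule is undefined")
    return {X: v / (1.0 - k12) for X, v in m12.items()}, k12

# Illustrative bba's on the frame {A, B, C} (hypothetical values).
theta = frozenset("ABC")
m1 = {frozenset("A"): 0.6, frozenset("AB"): 0.4}
m2 = {frozenset("B"): 0.5, theta: 0.5}
vacuous = {theta: 1.0}

fused, k12 = ds_rule(m1, m2)
swapped, _ = ds_rule(m2, m1)               # commutativity check
neutral, _ = ds_rule(m1, vacuous)          # the vacuous bba is neutral
print(k12, {"".join(sorted(X)): round(v, 4) for X, v in fused.items()})
```

In this example the only conflicting pair is $A$ against $B$, so $K_{12} = 0.6 \times 0.5 = 0.3$ and the remaining masses are divided by $0.7$.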

The conditioning of a given bba $m(.)$ by a conditional
element $Z \in 2^\Theta \setminus \{\emptyset\}$ has also been proposed by Shafer
[1]. This function $m(.|Z)$ is obtained by DS combination of
$m(.)$ with the bba $m_Z(.)$ only focused on $Z$, i.e. such that
$m_Z(Z) = 1$. For any element $X$ of the power set $2^\Theta$ this is
mathematically expressed by

$$m(X|Z) = [m \oplus m_Z](X) = [m_Z \oplus m](X) \qquad (4)$$

It has been proved [1] that this rule of conditioning, expressed
in terms of plausibility functions, yields the formula

$$Pl(X|Z) = Pl(X \cap Z)/Pl(Z) \qquad (5)$$

which is very similar to the well-known Bayes formula
$P(X|Z) = P(X \cap Z)/P(Z)$. Because of this, DST has been
widely considered as a generalization of Bayesian inference
[3], or equivalently, probability theory has been considered a special case
of the Mathematical Theory of Evidence when manipulating
Bayesian bba’s.
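Formula (5) can be checked numerically on a small case. The sketch below is an illustration under our own assumptions: `condition` implements Eq. (4) directly (intersecting every focal element with $Z$ and renormalizing, which is what DS combination with $m_Z$ amounts to), and the bba and the conditioning element are hypothetical values.

```python
from itertools import combinations

def pl(m, X):
    """Pl(X): sum of the masses of the focal elements intersecting X."""
    return sum(v for Y, v in m.items() if Y & X)

def condition(m, Z):
    """m(.|Z) of Eq. (4): DS combination of m with the bba m_Z(Z) = 1."""
    out = {}
    for Y, v in m.items():
        X = Y & Z                      # Z is the only focal element of m_Z
        if X:
            out[X] = out.get(X, 0.0) + v
    norm = sum(out.values())           # equals 1 - K, i.e. Pl(Z)
    if norm == 0:
        raise ZeroDivisionError("Pl(Z) = 0: conditioning is undefined")
    return {X: v / norm for X, v in out.items()}

# Illustrative bba and conditioning element (hypothetical values).
theta = frozenset("ABC")
m = {frozenset("A"): 0.5, frozenset("AB"): 0.3, theta: 0.2}
Z = frozenset("BC")
m_given_Z = condition(m, Z)

subsets = [frozenset(c) for r in range(1, 4) for c in combinations("ABC", r)]
for X in subsets:
    lhs = pl(m_given_Z, X)             # Pl(X|Z) computed from m(.|Z)
    rhs = pl(m, X & Z) / pl(m, Z)      # right-hand side of Eq. (5)
    print(sorted(X), round(lhs, 6), round(rhs, 6))
```

Both columns printed by the loop coincide for every non-empty $X$, as Eq. (5) states for a single conditioned bba.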

Despite the appealing properties of DS rule, its apparent
similarity with Bayes formula for conditioning, and many
attempts to justify its foundations, several challenges to the
theory’s validity have been put forth in the last decades, and
remain unanswered. For instance, an experimental protocol
to test DST was proposed by Lemmer in 1985 [9], and
his analysis shows an inherent paradox (contradiction) of
DST. Following a different approach, an inconsistency in
the fundamental postulates of DST was proved by Wang in
1994 [11]. Some other related works have been listed in the
introduction of this paper. In Section IV, we show through a
basic emblematic example where the problem of DS rule
comes from, and why it is very risky to use it in very sensitive
applications, especially where security, defense and safety are
involved. Before this, we recall in the next section the
principle of the Proportional Conflict Redistribution rule no.
5 (PCR5) defined within the DSmT framework [15] to combine
bba’s.


III. BASICS OF PCR5 FUSION RULE 


The idea behind the Proportional Conflict Redistribution 
rule no. 5 defined within DSmT [15] (Vol. 2) is to transfer 
conflicting masses (total or partial) proportionally to non- 
empty sets involved in the model according to all integrity 
constraints. The general principle of PCR rules is to: 1) 
calculate the conjunctive consensus between the sources of 
evidences; 2) calculate the total or partial conflicting masses; 
3) redistribute the conflicting mass (total or partial) propor- 
tionally on non-empty sets involved in the model according 
to all integrity constraints. Under Shafer’s model assumption³
of the frame $\Theta$, the PCR5 combination rule for only two
sources of information is defined as: $m_{PCR5}(\emptyset) = 0$ and
$\forall X \in 2^\Theta \setminus \{\emptyset\}$

$$m_{PCR5}(X) = m_{12}(X) + \sum_{\substack{Y \in 2^\Theta \setminus \{X\} \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2 m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2 m_1(Y)}{m_2(X) + m_1(Y)} \right] \qquad (6)$$


²A discussion on this topic can be found in [19].
³We consider only Shafer’s model in this paper and in our examples to
make the comparison with Dempster-Shafer’s rule results.


All sets involved in formula (6) are in canonical form.
$m_{12}(X)$ corresponds to the conjunctive consensus, i.e.:

$$m_{12}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2).$$


All denominators in (6) are assumed to be different from zero;
if a denominator is zero, that fraction is discarded. No matter how big or
small the conflicting mass is, PCR5 mathematically does a
better redistribution of the conflicting mass than Dempster-Shafer’s
rule, since PCR5 goes backwards on the tracks of the
conjunctive rule and redistributes the partial conflicting masses
only to the sets involved in the conflict and proportionally to
the masses they put in the conflict, considering the conjunctive
normal form of the partial conflict. PCR5 is quasi-associative
and also preserves the neutral impact of the vacuous belief
assignment but, contrary to DS rule, the PCR5 fusion rule
doesn’t allow the dictatorial power of a source, as will be
shown in Section IV. With the PCR5 rule, the fusion result can
always be revised as soon as informative evidences (i.e. not
vacuous bba’s) become available.
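The redistribution principle of formula (6) can be sketched for two sources as follows. This is a minimal illustration assuming Shafer’s model and a dictionary-of-frozensets representation of bba’s; the function name `pcr5` and the example masses are hypothetical. Each conflicting pair of focal elements $(X_1, X_2)$ returns its product $m_1(X_1) m_2(X_2)$ to $X_1$ and $X_2$ proportionally to the two masses involved.

```python
def pcr5(m1, m2):
    """Two-source PCR5 rule of Eq. (6) under Shafer's model (sketch)."""
    out = {}
    for X1, v1 in m1.items():
        for X2, v2 in m2.items():
            X = X1 & X2
            if X:
                # Non-conflicting pair: contributes to the conjunctive consensus m12.
                out[X] = out.get(X, 0.0) + v1 * v2
            elif v1 + v2 > 0:
                # Partial conflict v1*v2 is split between X1 and X2 in proportion
                # to v1 and v2; a zero denominator means the fraction is discarded.
                out[X1] = out.get(X1, 0.0) + v1 * v1 * v2 / (v1 + v2)
                out[X2] = out.get(X2, 0.0) + v2 * v2 * v1 / (v1 + v2)
    return out

# Illustrative bba's on the frame {A, B, C} (hypothetical values).
theta = frozenset("ABC")
m1 = {frozenset("A"): 0.6, frozenset("AB"): 0.4}
m2 = {frozenset("B"): 0.5, theta: 0.5}

fused = pcr5(m1, m2)
unchanged = pcr5(m1, {theta: 1.0})     # the vacuous bba has a neutral impact
print({"".join(sorted(X)): round(v, 4) for X, v in fused.items()})
```

Here the single partial conflict $m_1(A) m_2(B) = 0.3$ is split into $0.36 \times 0.5 / 1.1$ for $A$ and $0.25 \times 0.6 / 1.1$ for $B$, so the total mass remains equal to one without any normalization step.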


IV. AN EMBLEMATIC EXAMPLE SHOWING THE 
DICTATORIAL POWER OF DEMPSTER-SHAFER’S RULE 


Here we present an emblematic example showing the inadequate
behavior of Dempster-Shafer’s rule. We call this
behavior the dictatorial power (DP) of DS rule realized by
a given source, which is fundamental in DS reasoning. This
parametric example is not related to the level of conflict
between sources: in this example the level of conflict can
be chosen at any low or high value. We show clearly that
Dempster-Shafer’s rule does not respond to the combination
of different bba’s, since it always provides one and the same
result, which is not the behavior expected of a good fusion
rule for applications corresponding to the classical⁴ sense of
pooling of evidences [20].
Let’s consider the following frame⁵ $\Theta = \{A, B, C\}$ with
Shafer’s model. We consider two bba’s listed in Table I,
associated with two distinct bodies of evidence⁶ with parameters
$a$, $b_1$, and $b_2$ that can take any values, as long as $a \in [0, 1]$,
$b_1, b_2 > 0$, and $b_1 + b_2 \in (0, 1]$.

We grant that all the a priori assumptions below, considered
in DST, are fulfilled:

1) The sources of evidence are independent;

2) Both sources are equally reliable, i.e. both of them
are equally truthful.

As an additional third assumption in this
parametric example we consider:

3) Both sources are truly informative, hence neither of them
represents a fully ignorant source. This means that both sources have
their own specific opinions about the particular problem under
consideration, which should be taken into account in the
fusion process on equal terms.


TABLE I
INPUT BBA’S $m_1(.)$ AND $m_2(.)$.

Focal elem. \ bba’s | $m_1(.)$ | $m_2(.)$
$A$                 | $a$      | $0$
$A \cup B$          | $1-a$    | $b_1$
$C$                 | $0$      | $1-b_1-b_2$
$A \cup B \cup C$   | $0$      | $b_2$


⁴When putting all evidences together.

⁵$\Theta$ could correspond, for example, to three distinct pathologies of a patient.

⁶In a medical context, the two sources of evidence could correspond to two
distinct doctors providing their own medical diagnoses for the same patient.

When applying DS rule of combination, one gets:

1) using the conjunctive operator:

$$m_{12}(A) = a(b_1 + b_2) \qquad (7)$$
$$m_{12}(A \cup B) = (1 - a)(b_1 + b_2) \qquad (8)$$
$$K_{12} = m_{12}(\emptyset) = 1 - b_1 - b_2 \quad \text{(conflicting mass)} \qquad (9)$$

2) after normalizing by $1 - K_{12} = b_1 + b_2$, the result is:

$$m_{DS}(A) = \frac{m_{12}(A)}{1 - K_{12}} = \frac{a(b_1 + b_2)}{b_1 + b_2} = a = m_1(A) \qquad (10)$$
$$m_{DS}(A \cup B) = \frac{m_{12}(A \cup B)}{1 - K_{12}} = \frac{(1 - a)(b_1 + b_2)}{b_1 + b_2} = 1 - a = m_1(A \cup B) \qquad (11)$$

The final result obtained by using DS rule shows clearly that:
• Although assumption no. 3 is fulfilled for source
$m_2(.)$ (it is obviously a truly informative source of evidence),
its opinion doesn’t count at all in the fusion process performed
by DS rule, since one finally gets $m_{DS}(.) = m_1(.)$. Source 2 in fact plays
the role of a fully ignorant source, represented by the vacuous
belief assignment $m_v(A \cup B \cup C) = 1$, since $m_{DS}(.) = m_1(.)$
in the DST fusion process. This goes against the required a priori
assumption no. 2 of DST, for equally reliable/truthful sources
of evidence with opinions that have to be taken into account
on equal terms.

• The level of conflict $K_{12} = 1 - b_1 - b_2$ encountered between
the two sources doesn’t matter at all in the DS fusion process here,
since it can be chosen at any level, depending on the choice
of $b_1 + b_2$. No matter how high or how low the conflict is, the
result remains one and the same: $m_{DS}(.) = m_1(.)$.
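The behavior described by (7)-(11) can be reproduced numerically. The sketch below is an illustration under our own assumptions: the bba’s of Table I are encoded as dictionaries of frozensets, the function names `ds_rule` and `table1` are hypothetical, and the parameter values are arbitrary choices that produce a high, a medium and a very low level of conflict.

```python
def ds_rule(m1, m2):
    """Dempster-Shafer rule, Eqs. (1)-(3); returns (fused bba, K12)."""
    m12 = {}
    for X1, v1 in m1.items():
        for X2, v2 in m2.items():
            X = X1 & X2
            m12[X] = m12.get(X, 0.0) + v1 * v2
    k12 = m12.pop(frozenset(), 0.0)
    if 1.0 - k12 < 1e-12:
        raise ZeroDivisionError("total conflict: DS rule is undefined")
    return {X: v / (1.0 - k12) for X, v in m12.items()}, k12

def table1(a, b1, b2):
    """Input bba's of Table I for given parameters a, b1, b2."""
    A, AB, C, ABC = (frozenset(s) for s in ("A", "AB", "C", "ABC"))
    m1 = {A: a, AB: 1 - a}
    m2 = {AB: b1, C: 1 - b1 - b2, ABC: b2}
    return m1, m2

a = 0.3
results = []
for b1, b2 in [(0.05, 0.05), (0.3, 0.3), (0.5, 0.49)]:
    m1, m2 = table1(a, b1, b2)
    fused, k12 = ds_rule(m1, m2)
    results.append((m1, fused, k12, 1 - b1 - b2))
    # The fused masses printed here are those of m1, whatever b1 and b2 are.
    print(round(k12, 2), {"".join(sorted(X)): round(v, 6) for X, v in fused.items()})
```

With these three settings the conflict is approximately 0.9, 0.4 and 0.01, while the fused masses stay at $a$ for $A$ and $1-a$ for $A \cup B$, in agreement with (10) and (11).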

Put plainly, source 1 dictates its opinion through Dempster-Shafer’s
rule, which is what we consider a very inadequate
behavior for solving the problem of combination of evidences
in practice. Before analyzing this fundamental problem of
DST, let’s first take the position of devil’s advocate, and try
to defend the legitimacy of DST’s behavior. If we fully trust
source 1, the hypothesis $C$ must be ruled out of the frame,
because $Bel_1(C) = Pl_1(C) = 0$. So, according to source 1,
the original frame of discernment $\Theta = \{A, B, C\}$ should be
reduced to $\Theta' = \{A, B\}$, because $C \equiv \emptyset$ (based on the report
of source 1). If we consider $C$ impossible, then the
report (bba) of source 2 must be adapted/revised according
to Tables II and III. Because $m_2(.)$ must be a normalized
bba, the masses of all focal elements of $m_2(.)$ are divided by
$1 - m_2(\emptyset) = b_1 + b_2$, so that after adjustment and normalization
of $m_2(.)$, the two bba’s to combine are those presented in Table
IV. Based on this reasoning, we see that the adjusted and
normalized bba $m_2'(.)$ indeed plays the role of the vacuous bba
$m_v(.)$ when working with the reduced frame $\Theta' = \{A, B\}$,
which perfectly explains the result produced by DS rule.
Such reasoning unfortunately doesn’t prove that the
result makes sense, nor that it is correct. In fact such reasoning
shows clearly an asymmetry in the processing, since
source 1 is assumed to provide an absolute certainty on the event
“C cannot occur for sure”, whereas source 2 is adjusted
(conditioned) by the declaration of source 1. Such devil’s
advocate reasoning is in fact fallacious, because it erroneously
interprets the impossibility of
occurrence of $C$ as a definitive absolute truth (as if all
knowledge/evidences were available at source 1) in order to withdraw the
hypothesis $C$ from the original frame $\Theta$. In fact, the impossibility
of $C$ must be interpreted only as a conditional truth, because it
is based only on the partial knowledge related to source 1
(and not on the whole knowledge expressed when pooling the
evidences of the two sources).


TABLE II
ADJUSTED INPUT BBA’S (STEP 1).

Focal elem. \ bba’s    | $m_1(.)$ | $m_2(.)$
$A$                    | $a$      | $0$
$A \cup B$             | $1-a$    | $b_1$
$C \equiv \emptyset$   | $0$      | $1-b_1-b_2$
$A \cup B \cup C$      | $0$      | $b_2$


TABLE III
ADJUSTED INPUT BBA’S (STEP 2).

Focal elem. \ bba’s    | $m_1(.)$ | $m_2(.)$
$A$                    | $a$      | $0$
$A \cup B$             | $1-a$    | $b_1+b_2$
$C \equiv \emptyset$   | $0$      | $1-b_1-b_2$

TABLE IV
ADJUSTED BBA’S $m_1(.)$ AND $m_2'(.)$.

Focal elem. \ bba’s | $m_1(.)$ | $m_2'(.)$
$A$                 | $a$      | $0$
$A \cup B$          | $1-a$    | $\frac{b_1+b_2}{b_1+b_2} = 1$


For comparison, let’s present the respective
solution of our example obtained by the DSmT-based PCR5
fusion rule. The proportional redistribution of the mass of the
partial conflict $m_1(A) m_2(C) = a(1 - b_1 - b_2)$ is done by

$$\frac{x_A}{m_1(A)} = \frac{x_C}{m_2(C)} = \frac{m_1(A) m_2(C)}{m_1(A) + m_2(C)} = \frac{a(1 - b_1 - b_2)}{a + 1 - b_1 - b_2}$$

hence $x_A = \frac{a^2 (1 - b_1 - b_2)}{a + 1 - b_1 - b_2}$ and $x_C = \frac{a (1 - b_1 - b_2)^2}{a + 1 - b_1 - b_2}$.

Similarly, the redistribution of the partial conflict mass
$m_1(A \cup B) m_2(C) = (1 - a)(1 - b_1 - b_2)$ is done by

$$\frac{y_{A \cup B}}{m_1(A \cup B)} = \frac{y_C}{m_2(C)} = \frac{m_1(A \cup B) m_2(C)}{m_1(A \cup B) + m_2(C)} = \frac{(1 - a)(1 - b_1 - b_2)}{1 - a + 1 - b_1 - b_2}$$

hence $y_{A \cup B} = \frac{(1 - a)^2 (1 - b_1 - b_2)}{1 - a + 1 - b_1 - b_2}$ and $y_C = \frac{(1 - a)(1 - b_1 - b_2)^2}{1 - a + 1 - b_1 - b_2}$.

Therefore with PCR5, one gets a fusion result that does react
efficiently to the values of all the masses of focal elements of
each source, since one has:

$$m_{PCR5}(A) = m_{12}(A) + x_A = a(b_1 + b_2) + \frac{a^2 (1 - b_1 - b_2)}{a + 1 - b_1 - b_2} \qquad (12)$$

$$m_{PCR5}(A \cup B) = m_{12}(A \cup B) + y_{A \cup B} = (1 - a)(b_1 + b_2) + \frac{(1 - a)^2 (1 - b_1 - b_2)}{2 - a - b_1 - b_2} \qquad (13)$$

$$m_{PCR5}(C) = x_C + y_C = \frac{a (1 - b_1 - b_2)^2}{a + 1 - b_1 - b_2} + \frac{(1 - a)(1 - b_1 - b_2)^2}{2 - a - b_1 - b_2} \qquad (14)$$

In comparison with the performance of DS rule, the result obtained by
using the PCR5 rule shows clearly that the PCR5 fusion rule works
efficiently at any level of conflict, taking into account all the
a priori assumptions (1-3).
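The closed forms (12)-(14) can be cross-checked against a direct evaluation of the PCR5 formula (6). The following sketch is only illustrative: the function names `pcr5`, `table1` and `closed_form`, the data representation and the parameter values are our own assumptions.

```python
import math

def pcr5(m1, m2):
    """Two-source PCR5 rule of Eq. (6) under Shafer's model (sketch)."""
    out = {}
    for X1, v1 in m1.items():
        for X2, v2 in m2.items():
            X = X1 & X2
            if X:
                out[X] = out.get(X, 0.0) + v1 * v2
            elif v1 + v2 > 0:          # zero denominator: fraction discarded
                out[X1] = out.get(X1, 0.0) + v1 * v1 * v2 / (v1 + v2)
                out[X2] = out.get(X2, 0.0) + v2 * v2 * v1 / (v1 + v2)
    return out

def table1(a, b1, b2):
    """Input bba's of Table I."""
    A, AB, C, ABC = (frozenset(s) for s in ("A", "AB", "C", "ABC"))
    return {A: a, AB: 1 - a}, {AB: b1, C: 1 - b1 - b2, ABC: b2}

def closed_form(a, b1, b2):
    """Right-hand sides of Eqs. (12), (13) and (14)."""
    k = 1 - b1 - b2                    # conflicting mass K12 of Eq. (9)
    m_a = a * (b1 + b2) + a ** 2 * k / (a + k)
    m_ab = (1 - a) * (b1 + b2) + (1 - a) ** 2 * k / (1 - a + k)
    m_c = a * k ** 2 / (a + k) + (1 - a) * k ** 2 / (1 - a + k)
    return m_a, m_ab, m_c

a = 0.3
checks = []
for b1, b2 in [(0.05, 0.05), (0.3, 0.3), (0.5, 0.49)]:
    fused = pcr5(*table1(a, b1, b2))
    numeric = (fused[frozenset("A")], fused[frozenset("AB")], fused[frozenset("C")])
    checks.append((numeric, closed_form(a, b1, b2)))
    print([round(v, 4) for v in numeric])
```

Unlike the DS result, the three printed triples differ from one another, which reflects the dependence of (12)-(14) on $b_1$ and $b_2$.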


V. DISCUSSION AND ANALYSIS 


The result obtained by DS rule in the example
of Section IV seriously calls into question DS rule’s validity, as
well as its applicability to real fusion problems. We claim that
such a result is not acceptable at all. This example is more
crucial than the examples discussed in the existing literature,
because it shows clearly a serious flaw in DST behavior:
in this example the level of conflict between sources doesn’t
play a role, so it cannot be argued that in such a case DS rule
must not be applied because of a highly conflicting situation. We
can choose a low conflict level and the result is still the same.
The problem remains, and the DST-based result could have
dramatic consequences, especially in cases related
to human health or security. We claim that the problem behind
DS rule behavior comes not from the level of conflict between
the sources, but from something else.


A. The dictatorial power of source’s minority opinion 


Let’s recall the example and its strange results, and
discuss the reasoning process behind DS rule. The a
priori defined finite frame of discernment $\Theta = \{A, B, C\}$
satisfies Shafer’s requirement for a set of truly exhaustive
and exclusive hypotheses. Let’s first pay attention to the bba
associated with source 1. What is obvious and special about
$m_1(.)$ is the fact that $Pl_1(C) = 0$. One can reason from
here as follows:

1) Source 1 rules out with absolute certainty the hypothesis
$C$, considering it as impossible, because $Pl_1(C) = 0$.

2) The above opinion of source 1 (hypothesis $C$ considered
as absolutely impossible) cannot be revised if new informative
evidence is available for fusion. According to Shafer’s definition
[1], $Pl_1(C) = 0$ means that $m_1(X) = 0$ for every $X \in 2^\Theta$
such that $X \cap C \neq \emptyset$. When DS rule is applied to combine $m_1(.)$ and
an arbitrary $m'(.)$ (in our example $m'(.) = m_2(.)$), then for every
$Y \in 2^\Theta$ such that $Y \cap C \neq \emptyset$, $m_{DS}(Y) = 0$, because it is a sum
of products, each of which takes one of the above $m_1(X)$
as a factor. Consequently, $Pl_{DS}(C) = 0$, no matter what the
other source of evidence is.

3) Since with DS rule source 1 imposes its own opinion
on source 2, and in fact on any other sources (as soon as they
have a core including the core of source 1), DST supports the
dictatorial power of a given source by accepting the minority
opinion as a valid solution of the “fusion of evidences”,
and by banning at the same time all other sources’ different
opinions. This behavior is in full contradiction with the a
priori assumption no. 2 of DST for equally reliable sources
of information, which means their opinions should be taken
into account on equal terms in the fusion process (see [20] for
a complementary analysis).


B. On the total conflict case banned by DST


Let’s now try to reveal the logic behind the case that
DS rule cannot solve because of the indefiniteness (0/0): the
case of totally conflicting sources of information. We consider
the same frame of discernment $\Theta = \{A, B, C\}$ and two bba’s
(listed in Table V), associated with two distinct bodies of
evidence $m_1(.)$ and $m_2(.)$, with parameter $a \in [0,1]$.


TABLE V
INPUT BBA’S $m_1(.)$ AND $m_2(.)$ FOR THE CASE OF TOTAL CONFLICT.

Focal elem. \ bba’s | $m_1(.)$ | $m_2(.)$
$A$                 | $a$      | $0$
$A \cup B$          | $1-a$    | $0$
$C$                 | $0$      | $1$


It is obvious from Table V that:

1) Source 1 rules out with absolute certainty hypothesis
$C$, considering it as impossible since $Pl_1(C) = 0$.

2) Source 2 rules out with absolute certainty the hypotheses
$A$ and $B$, considering them as impossible since
$Pl_2(A \cup B) = 0$.

The a priori DST assumptions (1-3) still hold. So, the
question is: which source will possess the dictatorial power
in this special case? Following Shafer’s interpretation in this
example, the answer is: both sources have access to the
Absolute truth. But what is paradoxical and contradictory is
that, while simultaneously having access to the Absolute truth,
both sources mutually ban each other’s opinions.

Therefore Shafer’s interpretation, which allows both sources to
rule out all possible Absolute truths in an absolute manner, leads
to a strong contradiction by accepting the assertion that DS
rule cannot be used in such a totally conflicting case.

This assertion is substantiated by the mathematical
indefiniteness (0/0) obtained as an impossible “fusion result”. But actually,
behind the formal mathematical explanation, there resides
a real and strong logic: Shafer’s interpretation of a distinct Absolute truth
granted to each source doesn’t hold. The
Absolute truth is unique and it cannot yield contradictions
in the fusion process.


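For comparison, let’s again present the respective
solution in this special conflicting case, obtained by the DSmT-based
PCR5 fusion rule. The proportional redistribution of the
mass of the partial conflict $m_1(A) m_2(C) = a$ is done by

$$\frac{x_A}{m_1(A)} = \frac{x_C}{m_2(C)} = \frac{m_1(A) m_2(C)}{m_1(A) + m_2(C)} = \frac{a}{1 + a}$$

with $x_A = \frac{a^2}{1 + a}$ and $x_C = \frac{a}{1 + a}$.

Similarly, the proportional redistribution of the partial conflict
mass $m_1(A \cup B) m_2(C) = (1 - a)$ is done by

$$\frac{y_{A \cup B}}{m_1(A \cup B)} = \frac{y_C}{m_2(C)} = \frac{m_1(A \cup B) m_2(C)}{m_1(A \cup B) + m_2(C)} = \frac{1 - a}{2 - a}$$

with $y_{A \cup B} = \frac{(1 - a)^2}{2 - a}$ and $y_C = \frac{1 - a}{2 - a}$.

Finally, one gets using the PCR5 fusion rule

$$m_{PCR5}(A) = x_A = \frac{a^2}{1 + a} \qquad (15)$$

$$m_{PCR5}(A \cup B) = y_{A \cup B} = \frac{(1 - a)^2}{2 - a} \qquad (16)$$

$$m_{PCR5}(C) = x_C + y_C = \frac{a}{1 + a} + \frac{1 - a}{2 - a} \qquad (17)$$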

It is obvious that the DSmT-based PCR5 fusion rule works efficiently
even in this special totally conflicting case. This very
attractive rule is a non-Bayesian reasoning approach
which, unlike DST, is not based on such an inherent contradiction,
because PCR5 doesn’t support Shafer’s interpretation of a source-committed
Absolute truth and doesn’t allow the dictatorial power
of a single source’s opinion over all other sources involved in the
fusion.
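The total conflict case of Table V can also be checked numerically. In the sketch below (function names, data representation and the values of $a$ are our own illustrative assumptions), the conflict $K_{12}$ of Eq. (3) equals one, so the normalization in DS rule (1) would be a 0/0 division, while PCR5 returns the masses given by (15)-(17).

```python
import math

def total_conflict(m1, m2):
    """K12 of Eq. (3): mass of all pairs of focal elements with empty intersection."""
    return sum(v1 * v2 for X1, v1 in m1.items()
               for X2, v2 in m2.items() if not (X1 & X2))

def pcr5(m1, m2):
    """Two-source PCR5 rule of Eq. (6) under Shafer's model (sketch)."""
    out = {}
    for X1, v1 in m1.items():
        for X2, v2 in m2.items():
            X = X1 & X2
            if X:
                out[X] = out.get(X, 0.0) + v1 * v2
            elif v1 + v2 > 0:          # zero denominator: fraction discarded
                out[X1] = out.get(X1, 0.0) + v1 * v1 * v2 / (v1 + v2)
                out[X2] = out.get(X2, 0.0) + v2 * v2 * v1 / (v1 + v2)
    return out

A, AB, C = frozenset("A"), frozenset("AB"), frozenset("C")
checks = []
for a in (0.2, 0.5, 0.9):
    m1 = {A: a, AB: 1 - a}             # Table V, source 1
    m2 = {C: 1.0}                      # Table V, source 2
    k12 = total_conflict(m1, m2)       # equals 1: DS rule (1) is undefined
    fused = pcr5(m1, m2)
    expected = {A: a ** 2 / (1 + a),                 # Eq. (15)
                AB: (1 - a) ** 2 / (2 - a),          # Eq. (16)
                C: a / (1 + a) + (1 - a) / (2 - a)}  # Eq. (17)
    checks.append((k12, fused, expected))
    print(a, {"".join(sorted(X)): round(v, 4) for X, v in fused.items()})
```

The masses in (15)-(17) sum to $a + (1 - a) = 1$, so no normalization is needed even though $K_{12} = 1$.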


C. Remark on Dempster-Shafer conditioning 


Some comments must also be given about DS conditioning
rule (4) and the expression (5) for the conditional plausibility.
Let us consider $\Theta$ and two bba’s $m_1(.)$ and $m_2(.)$ defined on
$2^\Theta$ and their DS combination $m_{DS}(.) = [m_1 \oplus m_2](.)$, and
let us assume a conditioning element $Z \neq \emptyset$ in $2^\Theta$ and the bba
$m_Z(Z) = 1$; then

$$m_{DS}(.|Z) = [m_{DS} \oplus m_Z](.) = [m_1 \oplus m_2 \oplus m_Z](.)$$

Because $m_{DS}(.) = [m_1 \oplus m_2](.)$ is inconsistent with the
probability calculus [10], [11], [13], [14], [19], $m_{DS}(.|Z)$
is also inconsistent. Therefore, for any $X$ in $2^\Theta$, the conditional
plausibility $Pl(X|Z)$ expressed by $Pl(X|Z) = Pl(X \cap Z)/Pl(Z)$
obtained from $m_{DS}(.|Z)$, having an apparent similarity
with Bayes formula, is in fact not compatible with the
conditional probability as soon as several sources of evidence
are involved.




VI. FUNDAMENTAL THEOREM ON THE INHERENT 
CONTRADICTION IN DST FOUNDATIONS 


On the basis of the previous examples and after a detailed
analysis of the results drawn from Dempster-Shafer’s rule and
DST reasoning discussed in the previous section, we establish the
fundamental theorem on the inherent contradiction of DST
foundations.


Theorem: Dempster-Shafer Theory is wrong because its
foundation is based on an inherent logical contradiction.


Proof: In the basis of DST [1], Shafer considers:

• An a priori defined finite frame of discernment $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$
with $n > 2$, satisfying Shafer’s requirement
for a set of truly exhaustive and exclusive hypotheses. Recalling
Shafer’s statement about DST [1] (p. 36): “This formalism is
most easily introduced in the case where we are concerned with
the true value of some quantity. If we denote the quantity by $\theta$
and the set of its possible values by $\Theta$, then the propositions of
interest are precisely those of the form ‘The true value of $\theta$ is
in $T$,’ where $T$ is a subset of $\Theta$”.

• Available independent sources of evidence associated with
corresponding bba’s $m_i(.)$, $i = 1, 2, \ldots$, where all the sources are
equally reliable/trustable and can be truly informative (not fully
ignorant).

• The level of conflict between the sources can take any low
or high value strictly less than one, so that Dempster-Shafer’s
rule is mathematically defined.


On the basis of the above considerations, one encounters the
fundamental contradiction:


1) A given source of evidence $m_i(.)$ can become unrevisable
during the fusion when it is allowed to rule out with
absolute certainty some hypothesis $\theta_k$, $k \in [1, n]$, in the
frame $\Theta$ (if $Pl_i(\theta_k) = 0$, as shown in our emblematic
example).

2) DS rule cannot solve the case of total conflict between
the sources (because of the mathematical indefiniteness
0/0). This corresponds to the case when both sources:
(i) have access to the Absolute truth; (ii) can become
unrevisable during the fusion if they are allowed to rule out
with absolute certainty all hypotheses in the frame $\Theta$,
mutually banning each other’s opinions. The inability of
DS rule to solve this case strongly supports the assertion
that the Absolute truth must be unique. Otherwise the
total conflict case could also be solved/processed by DS
rule. So, Shafer’s interpretation of a distinct Absolute truth
granted to each source does not hold.


Therefore, from point 2), DST agrees with the assertion
that the Absolute truth is unique and cannot be a contradiction.
This assertion fully contradicts Shafer’s interpretation
of a distinct Absolute truth granted to each source, stated
in point 1). This proves the fundamental contradiction in the
foundations of DST and completes the proof of our Theorem.


VII. CONCLUSION 


In this paper, we have identified and brought to light the
very serious inherent contradiction of the foundations of
Dempster-Shafer Theory. On the basis of a simple emblematic example, we
have analyzed and explained the inconsistent and inadequate
behavior of Dempster-Shafer’s rule of combination as a valid
method for the combination of sources of evidence. We have
identified the cause and the effect of the dictatorial power
behavior of this rule and of its inability to manage the
conflicts between the sources in a consistent logical way. For
comparison, the respective solutions obtained by
the more adequate PCR5 fusion rule, proposed originally in the
Dezert-Smarandache Theory framework, were presented. This
very attractive rule corresponds to a non-Bayesian reasoning
approach which, unlike DST, is not based on such an inherent
contradiction, because PCR5 doesn’t support Shafer’s interpretation of a
source-committed Absolute truth and doesn’t allow the dictatorial
power of a single source’s opinion over all other sources involved
in the fusion.
Comparative Study of Contradiction Measures
in the Theory of Belief Functions


Deqiang Han


Abstract—Uncertainty measures in the theory of belief functions
are important for uncertainty representation and
reasoning. Many measures of uncertainty in the theory of belief
functions have been introduced. The degree of discord (or
conflict) inside a body of evidence is an important index for
measuring the degree of uncertainty. Recently, distance of evidence
has been used to define a contradiction measure for quantifying the
degree of discord inside a body of evidence. The contradiction
measure, redefined in this paper, is the weighted summation of the distance
values between a given basic belief assignment (bba) and the
categorical bba’s defined on each focal element of the given bba.
It has a normalized value and can well
characterize the self-discord incorporated in bodies of evidence.
Some numerical examples with comparisons
among different uncertainty measures are provided, together
with related analyses, to show the rationality of the proposed
contradiction measure.

Index Terms—Evidence theory, uncertainty measure, belief
function, discord, conflict.


I. INTRODUCTION 


Dempster-Shafer evidence theory [1], also known as the theory
of belief functions, is one of the important tools for reasoning under
uncertainty. It has been widely used in many applications.
Evidence theory can be seen as a generalization of probability
theory, where the additivity axiom is excluded. In probability
theory, Shannon entropy [2] is often used for quantifying
uncertainty, while in the framework of evidence theory
an uncertainty measure is also needed for quantifying the degree
of uncertainty incorporated in a body of evidence (BOE).

In uncertainty theories, we can consider two types of
uncertainty, discord (or conflict) and non-specificity,
hence ambiguity [3]. Several types of
uncertainty measures have emerged in the theory of belief functions. They are
either generalizations of Shannon entropy and other types of
uncertainty measures in probability theory, or are established
based on the conflict obtained by using some combination
rule. For example, non-specificity [4], proposed by Dubois and
Prade, is a generalization of Hartley entropy [5]; the aggregate
uncertainty (AU) measure [6] and the ambiguity measure (AM) [3]
can be regarded as generalized forms of Shannon entropy.
In Martin’s work [7], [8], the auto-conflict measure was
proposed based on the conjunctive combination rule. There are
also many other types of uncertainty measures in the theory of
belief functions (see details in [3], [9], [11]). All the available
uncertainty measures characterize the uncertainty either from
one aspect (e.g. non-specificity and discord) or as a whole, i.e.
the total uncertainty (e.g. AM and AU).


Originally published as: Smarandache F., Han D., Martin A. - Comparative
Study of Contradiction Measures in the Theory of Belief Functions, in
Proceedings of the 15th International Conference on Information Fusion,
Singapore, 9-12 July 2012, and reprinted with permission.

As in [7], [11], we attempt to depart from the traditional ways of 
establishing uncertainty measures in the theory of belief functions. 
That is, we neither generalize the uncertainty measures of 
probability theory nor use a combination rule to obtain the 
uncertainty measure. In this 
paper we modify the contradiction measure proposed in [11] 
to characterize the degree of internal conflict (or discord) 
in bba’s. For a bba with L focal elements, 
a categorical bba (a bba with a unique focal element) can be 
built on each focal element, so there are 
L categorical bba’s in total. We calculate Jousselme’s distance 
of evidence [10] between the given bba and each 
categorical bba, which yields L distance values. 
Using the masses of the given bba as weights, 
the weighted sum of the corresponding L 
distance values gives the contradiction. To normalize 
the contradiction measure, a normalization 
factor is designed and added. Some simulation results are 
provided to verify the correctness of the normalization factor. 
This contradiction measure can well characterize the conflict 
incorporated in a BOE, i.e., the self-conflict or internal conflict. 
Some numerical examples with comparisons among different 
uncertainty measures in the theory of belief functions are also 
provided to show the rationality of the proposed contradiction 
measure. It should be noted that this work is based on our 
previous paper [11]. The idea of constructing a contradiction 
measure based on the distance of evidence was first proposed, 
in a preliminary form, in that paper; its definition contained 
some errors (corrected here) and its related analyses were far 
from sufficient. 


II. BASICS IN THE THEORY OF BELIEF FUNCTIONS 


A. Basic concepts in the theory of belief functions 


In Dempster-Shafer evidence theory [1], the elements in 
the frame of discernment (FOD), denoted by $\Theta$, are mutually 
exclusive and exhaustive. Suppose that $2^\Theta$ denotes the 
powerset of the FOD, and define the function $m : 2^\Theta \to [0,1]$ as the 
basic belief assignment (bba) satisfying: 

$$\sum_{A \subseteq \Theta} m(A) = 1, \quad m(\emptyset) = 0 \qquad (1)$$


A bba is also called a mass function. The belief function (Bel) 
and the plausibility function (Pl) are defined below, respectively: 

$$Bel(A) = \sum_{B \subseteq A} m(B) \qquad (2)$$

$$Pl(A) = \sum_{A \cap B \neq \emptyset} m(B) \qquad (3)$$


Suppose there are two bba’s $m_1$, $m_2$ over the FOD $\Theta$ with 
focal elements $A_1, \ldots, A_k$ and $B_1, \ldots, B_l$, respectively. If 
$K = \sum_{A_i \cap B_j = \emptyset} m_1(A_i) m_2(B_j) < 1$, the function 
$m : 2^\Theta \to [0,1]$ defined by 

$$m(A) = \begin{cases} 0, & A = \emptyset \\ \dfrac{\sum_{A_i \cap B_j = A} m_1(A_i) m_2(B_j)}{1 - \sum_{A_i \cap B_j = \emptyset} m_1(A_i) m_2(B_j)}, & A \neq \emptyset \end{cases} \qquad (4)$$

is a bba. The rule defined in Eq. (4) is called Dempster’s rule 
of combination. In Dempster’s rule of combination, 

$$K = \sum_{A_i \cap B_j = \emptyset} m_1(A_i) m_2(B_j) \qquad (5)$$

is used to represent the conflict between two BOEs. In recent 
research [12], both K and the distance of evidence are used to 
construct a two-tuple to represent the conflict between BOEs. 
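
The definitions in Eqs. (2)-(5) can be illustrated with a minimal Python sketch. It is not taken from the paper: the representation of a bba as a dictionary from `frozenset` focal elements to masses, the function names, and the two-element frame `{a, b}` with its masses are all assumptions made for illustration.

```python
from itertools import product

def bel(m, A):
    """Belief of A, Eq. (2): total mass of the focal elements contained in A."""
    return sum(v for B, v in m.items() if B <= A)

def pl(m, A):
    """Plausibility of A, Eq. (3): total mass of the focal elements meeting A."""
    return sum(v for B, v in m.items() if B & A)

def conflict(m1, m2):
    """Conflict K, Eq. (5): product mass falling on empty intersections."""
    return sum(v1 * v2
               for (A, v1), (B, v2) in product(m1.items(), m2.items())
               if not (A & B))

def dempster(m1, m2):
    """Dempster's rule, Eq. (4); undefined when the conflict reaches 1."""
    K = conflict(m1, m2)
    if K >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    out = {}
    for (A, v1), (B, v2) in product(m1.items(), m2.items()):
        C = A & B
        if C:  # keep only non-empty intersections
            out[C] = out.get(C, 0.0) + v1 * v2 / (1.0 - K)
    return out

# Hypothetical frame {a, b}; the masses are illustrative only.
a, b = frozenset("a"), frozenset("b")
theta = a | b
m1 = {a: 0.6, theta: 0.4}
m2 = {b: 0.5, theta: 0.5}
m12 = dempster(m1, m2)
```

In this toy case the only conflicting pair is ({a}, {b}), so K = 0.6 x 0.5 = 0.3 and the combined masses are the remaining products divided by 0.7.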


B. Uncertainty measures in the theory of belief functions 


In the theory of belief functions, a BOE hides two types 
of uncertainty: non-specificity [4] and discord, which together 
constitute ambiguity [3]. The available definitions of the degree 
of uncertainty in the theory of belief functions are briefly 
introduced below. 

1) Auto-conflict 

An n-order auto-conflict measure was proposed in [7] based 
on the non-normalized conjunctive combination rule [13]: 

$$a_n = \left( \bigoplus_{i=1}^{n} m \right)(\emptyset) \qquad (6)$$

The conjunctive combination rule $\oplus$ is defined as 

$$m_{Conj}(C) = \sum_{A \cap B = C} m_1(A) m_2(B) := (m_1 \oplus m_2)(C) \qquad (7)$$

When n = 2, the auto-conflict equals K in Dempster’s rule 
of combination (with $m_1 = m_2 = m$). 
2) Non-specificity 

$$N(m) = \sum_{A \subseteq \Theta} m(A) \log_2 |A| \qquad (8)$$

Non-specificity can be seen as a weighted sum of the Hartley 
measure over the different focal elements. 

3) Confusion 

Höhle proposed the measure of confusion [14], using the bba 
and the belief function in the spirit of entropy, as follows. 

$$Confusion(m) = -\sum_{A \subseteq \Theta} m(A) \log_2(Bel(A)) \qquad (9)$$

4) Dissonance 

Yager proposed the measure of dissonance [14], using the 
bba and the plausibility function in the spirit of entropy, as follows. 

$$Dissonance(m) = -\sum_{A \subseteq \Theta} m(A) \log_2(Pl(A)) \qquad (10)$$
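
As a hedged illustration of Eqs. (8)-(10), the sketch below evaluates the three measures on a small bba. The dictionary representation, the function names and the three-element frame are assumptions for illustration and do not come from the paper.

```python
from math import log2

def bel(m, A):
    """Belief of A: total mass of the focal elements contained in A."""
    return sum(v for B, v in m.items() if B <= A)

def pl(m, A):
    """Plausibility of A: total mass of the focal elements meeting A."""
    return sum(v for B, v in m.items() if B & A)

def nonspecificity(m):
    """Eq. (8): mass-weighted Hartley measure of the focal elements."""
    return sum(v * log2(len(A)) for A, v in m.items() if v > 0)

def confusion(m):
    """Eq. (9): entropy-like sum using Bel(A) inside the logarithm."""
    return -sum(v * log2(bel(m, A)) for A, v in m.items() if v > 0)

def dissonance(m):
    """Eq. (10): entropy-like sum using Pl(A) inside the logarithm."""
    return -sum(v * log2(pl(m, A)) for A, v in m.items() if v > 0)

# Hypothetical bba on the frame {a, b, c}: half the mass on {a}, half on the frame.
theta = frozenset("abc")
m = {frozenset("a"): 0.5, theta: 0.5}

n_val = nonspecificity(m)   # 0.5 * log2(3)
c_val = confusion(m)        # 0.5, since Bel({a}) = 0.5 and Bel(frame) = 1
d_val = dissonance(m)       # 0, since both focal elements have plausibility 1
```

For this bba the dissonance vanishes because every focal element intersects every other one, whereas the confusion does not.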


5) Aggregate Uncertainty measure (AU) 

Several definitions have emerged that aim to represent 
the total uncertainty in the theory of belief functions. The 
most representative one is a kind of generalized Shannon 
entropy [2], i.e., the aggregate uncertainty (AU) [6]. 

Let Bel be a belief measure on the FOD $\Theta$. The AU 
associated with Bel is measured by: 

$$AU(Bel) = \max_{P_{Bel}} \left[ -\sum_{\theta \in \Theta} p_\theta \log_2 p_\theta \right] \qquad (11)$$

where the maximum is taken over all probability distributions 
that are consistent with the given belief function. $P_{Bel}$ consists 
of all probability distributions $\langle p_\theta \mid \theta \in \Theta \rangle$ satisfying: 

$$\begin{cases} p_\theta \in [0,1], & \forall \theta \in \Theta \\ \sum_{\theta \in \Theta} p_\theta = 1 \\ Bel(A) \le \sum_{\theta \in A} p_\theta \le 1 - Bel(\bar{A}), & \forall A \subseteq \Theta \end{cases} \qquad (12)$$

As illustrated in Eq. (11) and Eq. (12), the calculation of AU 
is an optimization problem in which the bba’s 
(or belief functions) establish the constraints. AU is also 
called the "upper entropy". It 
is an aggregated total uncertainty (ATU) measure, which can 
capture both non-specificity and discord. 

AU satisfies all the requirements for an uncertainty measure [9], 
which include probability consistency, set consistency, 
value range, sub-additivity and additivity for the joint 
BPA in Cartesian space. However, AU has the following 
shortcomings [3]: high computational complexity, high insensitivity to 
changes of evidence, etc. 

6) Ambiguity Measure (AM) 

Jousselme et al. [3] proposed the ambiguity measure (AM), 
aiming to describe the non-specificity and discord in the theory 
of belief functions. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ be a FOD and let 
m be a bba defined on $\Theta$. Define 

$$AM(m) = -\sum_{\theta \in \Theta} BetP_m(\theta) \log_2(BetP_m(\theta)) \qquad (13)$$

where $BetP_m(\theta) = \sum_{\theta \in B \subseteq \Theta} m(B)/|B|$ is the pignistic 
probability distribution proposed by Smets [16]. Jousselme 
et al. [3] declared that the ambiguity measure satisfies the 
requirements of an uncertainty measure while 
overcoming the defects of AU; in fact, however, AM does not satisfy 
sub-additivity, as pointed out by Klir [17]. 
Moreover, in the work of Abellan [9], AM has been proved to 
be logically non-monotonic under some circumstances. 

There are also other existing uncertainty measures in the 
theory of belief functions; see details in [3]. 

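
The pignistic transformation and the AM of Eq. (13) can be sketched as follows. This is an illustrative sketch only; the dictionary representation of a bba, the function names and the four-element frame are assumptions.

```python
from math import log2

def betp(m):
    """Pignistic probability: each focal element splits its mass equally
    among the elements it contains."""
    p = {}
    for B, v in m.items():
        for theta in B:
            p[theta] = p.get(theta, 0.0) + v / len(B)
    return p

def ambiguity(m):
    """AM of Eq. (13): base-2 Shannon entropy of the pignistic probability."""
    return -sum(q * log2(q) for q in betp(m).values() if q > 0)

# Hypothetical four-element frame.
frame = frozenset("abcd")
vacuous = {frame: 1.0}                              # total ignorance
uniform = {frozenset(x): 0.25 for x in frame}       # uniform Bayesian bba

am_vacuous = ambiguity(vacuous)
am_uniform = ambiguity(uniform)
```

Both bba’s have the same pignistic probability (0.25 on each element), so AM returns 2 bits for both: as a total uncertainty measure it does not separate the non-specificity of the vacuous bba from the discord of the uniform Bayesian one.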
III. CONTRADICTION MEASURE BASED ON DISTANCE OF 
EVIDENCE 


As we can see in the previous section, all the available 
uncertainty measures in the theory of belief functions are direct 
or indirect generalizations of the entropy defined in probability 
theory, or are defined by using some combination rule. 
In [11], we departed from such entropy-inspired approaches: 
the distance of evidence is used to construct the uncertainty 
degree, which is called contradiction and is shown below. 

$$Contr_m(m) = \sum_{X \in \mathcal{X}} m(X) \cdot d(m, m_X) \qquad (14)$$

where $\mathcal{X}$ represents the set of all the focal elements of m(·) 
and $m_X$ is the categorical bba whose unique focal element is X. 
It should be noted that the definition in Eq. (14) is not 
normalized. We should obtain a normalized definition 
for convenience of use. 

The maximum contradiction measure for m(·) defined on 
$\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ occurs when m(·) has a uniform 
distribution: 

$$m(\{\theta_1\}) = m(\{\theta_2\}) = \cdots = m(\{\theta_n\}) = \frac{1}{n}$$

It depends on the cardinality of $\Theta$ and on the distance used. 


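For $|\Theta| = n$, using Jousselme’s distance, we get 
$\max Contr_m = \sqrt{\frac{n-1}{2n}}$. 

Proof: 

$$Contr_m = n \cdot \frac{1}{n} \cdot d(m, m_{\theta_1}) = d(m, m_{\theta_1})$$

where 

$$m_{\theta_i}(\{\theta_i\}) = 1, \quad m_{\theta_i}(\{\theta_j\}) = 0, \quad j \neq i, \quad i, j = 1, \ldots, n$$

since the distance between m and $m_{\theta_i}$ is the same for every i. 
Restricted to the n singletons, the matrix Jac is the identity, so 

$$d(m, m_{\theta_1}) = \sqrt{0.5\,(m - m_{\theta_1})^T \, Jac \, (m - m_{\theta_1})}$$

$$= \sqrt{0.5 \left[ -\tfrac{n-1}{n}, \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right] \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \left[ -\tfrac{n-1}{n}, \tfrac{1}{n}, \ldots, \tfrac{1}{n} \right]^T}$$

$$= \sqrt{0.5 \cdot \frac{(n-1)^2 + n - 1}{n^2}} = \sqrt{0.5 \cdot \frac{n^2 - n}{n^2}} = \sqrt{\frac{n-1}{2n}}$$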

Therefore, in this paper, we use the normalization factor 

$$\sqrt{\frac{2n}{n-1}}$$

and the corrected normalized contradiction measure is 
defined below: 

$$Contr_m(m) = \sqrt{\frac{2n}{n-1}} \cdot \sum_{X \in \mathcal{X}} m(X) \cdot d(m, m_X) \qquad (15)$$
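
A minimal Python sketch of the normalized contradiction follows. It assumes Jousselme’s distance in its usual form, $d(m_1, m_2) = \sqrt{0.5\,(m_1 - m_2)^T Jac\,(m_1 - m_2)}$ with $Jac(A, B) = |A \cap B| / |A \cup B|$, and the normalization factor $\sqrt{2n/(n-1)}$ derived in the proof; the dictionary representation of a bba and the function names are assumptions for illustration.

```python
from math import sqrt

def jousselme(m1, m2):
    """Jousselme's distance between two bba's given as {frozenset: mass}."""
    sets = set(m1) | set(m2)
    diff = {A: m1.get(A, 0.0) - m2.get(A, 0.0) for A in sets}
    # Quadratic form with Jaccard weights |A & B| / |A | B|.
    q = sum(diff[A] * diff[B] * len(A & B) / len(A | B)
            for A in sets for B in sets)
    return sqrt(max(0.5 * q, 0.0))  # clamp tiny negative rounding errors

def contradiction(m, n):
    """Normalized contradiction for a bba m on a frame of n >= 2 elements."""
    total = sum(v * jousselme(m, {X: 1.0}) for X, v in m.items() if v > 0)
    return sqrt(2.0 * n / (n - 1)) * total

# Uniform Bayesian bba on a hypothetical 3-element frame: the normalized value is 1.
uniform3 = {frozenset([i]): 1.0 / 3 for i in range(3)}
c_uniform = contradiction(uniform3, 3)

# A categorical bba has no internal conflict: the value is 0.
c_categorical = contradiction({frozenset(range(3)): 1.0}, 3)
```

The uniform case reproduces the value $\sqrt{(n-1)/(2n)} \cdot \sqrt{2n/(n-1)} = 1$ obtained in the proof for n = 3.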


To further verify the correctness of the normalization factor, 
we design the experiments as follows. 

Randomly generate 500 bba’s and calculate their corre- 
sponding contradiction values based on Eq. (15). The method 
to randomly generate bba’s is as follows [18]. 

Input: Θ: frame of discernment; 
       N_max: maximum number of focal elements 
Output: Bel: belief function (under the form of a bba, m) 
Generate the power set of Θ → P(Θ); 
Generate a random permutation of P(Θ) → R(Θ); 
Generate an integer between 1 and N_max → k; 
FOREACH of the first k elements of R(Θ) do 
    Generate a value within [0,1] → m_i, i = 1, ..., k; 
END 
Normalize the vector m = [m_1, ..., m_k] → m’; 
m(A_k) = m’_k; 

Algorithm 1: Random generation of bba 
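
The steps of Algorithm 1 can be sketched in Python as below. This is one possible reading, not the authors’ implementation: the function name `random_bba` is hypothetical, the empty set is excluded from the candidate focal elements so that m(∅) = 0 holds by construction, and the random values are drawn from (0, 1] to avoid an all-zero vector.

```python
import random
from itertools import combinations

def random_bba(frame, n_max, rng=random):
    """Draw a random bba on `frame` with at most `n_max` focal elements."""
    elems = sorted(frame)
    # Non-empty subsets of the frame (assumption: the empty set is left out).
    subsets = [frozenset(c)
               for r in range(1, len(elems) + 1)
               for c in combinations(elems, r)]
    rng.shuffle(subsets)                             # random permutation R(Theta)
    k = rng.randint(1, min(n_max, len(subsets)))     # number of focal elements
    weights = [1.0 - rng.random() for _ in range(k)]  # values in (0, 1]
    total = sum(weights)
    # Normalize so that the masses sum to one.
    return {A: w / total for A, w in zip(subsets[:k], weights)}

# Illustrative call on a hypothetical 3-element frame with a fixed seed.
example = random_bba({"t1", "t2", "t3"}, 4, random.Random(0))
```

Passing a seeded `random.Random` instance makes an experiment repeatable, which is convenient when a specific bba (such as the 15th one in a run) has to be inspected.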


Based on the above algorithm, the generated bba’s have a 
random number of focal elements. We set the cardinality of the 
FOD to 3 and 4, respectively, which gives 
two experiments in total; the experimental results are 
illustrated in Fig. 1. 

Fig. 1. Values of contradiction Contr_m 


As shown in Fig. 1, when $|\Theta| = 3$, the maximum value (one) is 
obtained at the 15th bba, which is: 

$$m(\{\theta_1\}) = m(\{\theta_2\}) = m(\{\theta_3\}) = 1/3.$$

When $|\Theta| = 4$, the maximum value (one) is obtained at the 489th 
bba, which is: 

$$m(\{\theta_1\}) = m(\{\theta_2\}) = m(\{\theta_3\}) = m(\{\theta_4\}) = 1/4.$$

From the proof and the experiments above, it can be seen 
that the selection of the normalization factor is correct. 


IV. EXAMPLES 


A. Example 1 


In this experiment, we use bba’s whose focal elements are 
singletons and the total set. Suppose that the FOD is 
$\Theta = \{\theta_1, \theta_2, \ldots, \theta_5\}$. The initial bba is 

$$m(\{\theta_1\}) = m(\{\theta_2\}) = m(\{\theta_3\}) = m(\{\theta_4\}) = m(\{\theta_5\}) = 0; \quad m(\Theta) = 1$$

Then at each step, the mass $m(\Theta)$ decreases by Δ = 0.05, 
and the mass of each $m(\{\theta_i\})$ increases by Δ/5 = 0.01, where 
i = 1,...,5. After 20 steps, $m(\Theta)$ becomes zero and 
$m(\{\theta_1\}) = m(\{\theta_2\}) = m(\{\theta_3\}) = m(\{\theta_4\}) = m(\{\theta_5\}) = 0.2$, 
and the experiment finishes. 

Fig. 2. Comparisons among different uncertainty measures in Example 1 

As we can see in Fig. 2, although AU is deemed a total 
uncertainty measure, it cannot detect the change of the bba at 
each step. 

The values of non-specificity decrease as the masses of the 
singletons increase. 

The values of contradiction, K, dissonance and confusion 
all increase as the masses of the singletons increase. 
Contradiction increases faster than K in the first half of the 
steps and slower than K in the second half. 
Confusion increases faster than dissonance in the first half 
of the steps and slower than dissonance in the 
second half. The trends of contradiction and confusion 
are more rational, because in the first half of the steps the 
relative changes of the masses of the singletons are more 
significant than in the second half. 

The value of contradiction belongs to [0,1] and it reaches 
its maximum value at the final step, i.e., 
when $m(\{\theta_1\}) = m(\{\theta_2\}) = m(\{\theta_3\}) = m(\{\theta_4\}) = m(\{\theta_5\}) = 0.2$, 
$Contr_m = 1$. 


B. Example 2 


In this experiment, we use bba’s whose focal elements are 
singletons and the total set. Suppose that the FOD is 
$\Theta = \{\theta_1, \theta_2, \ldots, \theta_5\}$. The initial bba is 

$$m(\{\theta_1\}) = m(\{\theta_2\}) = m(\{\theta_3\}) = m(\{\theta_4\}) = m(\{\theta_5\}) = 0; \quad m(\Theta) = 1$$

Then at each step, the mass $m(\Theta)$ decreases by Δ = 0.05, 
and the mass of one singleton, $m(\{\theta_1\})$, increases by Δ = 0.05. 
After 20 steps, $m(\Theta)$ becomes zero and the 
experiment finishes. 

Fig. 3. Comparisons among different uncertainty measures in Example 2 

As we can see in Fig. 3, with the increase of $m(\{\theta_1\})$ and 
the decrease of $m(\Theta)$ at each step, AU and non-specificity 
decrease. 

Although the non-specificity of the original bba is the highest, 
its internal conflict should be the lowest. So AU cannot 
characterize the discord part of the uncertainty incorporated in the 
BOE. 

K and dissonance cannot detect the change of the bba. 

The value of the proposed contradiction increases at first 
and reaches its maximum when the bba becomes 

$$m(\{\theta_1\}) = 0.5, \quad m(\Theta) = 0.5$$

Then, with the increase of $m(\{\theta_1\})$ and the decrease of $m(\Theta)$ 
in the following steps, the value of the proposed contradiction 
decreases, and it reaches zero when $m(\{\theta_1\}) = 1$, which is the 
clearest case. 

If we regard the two focal elements $\{\theta_1\}$ and $\Theta$ as 
distinct elements of the power set of $\Theta$, the uncertainty 
reaches its maximum when their masses are equal. This seems more 
rational. 

Confusion has a trend similar to that of 
our proposed contradiction measure, but the maximum value 
of confusion does not occur at the middle. 
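
The location of this maximum can be checked in closed form. The following is a short worked derivation, not part of the original paper, assuming Jousselme’s distance with weights $|A \cap B| / |A \cup B|$ and the normalization factor $\sqrt{2n/(n-1)}$ of Eq. (15), for $m(\{\theta_1\}) = a$ and $m(\Theta) = 1 - a$ on a frame of n elements (so that the weight between $\{\theta_1\}$ and $\Theta$ is $1/n$):

```latex
\begin{align*}
d(m, m_{\{\theta_1\}})^2 &= \tfrac{1}{2}(1-a)^2\left(1 + 1 - \tfrac{2}{n}\right) = \tfrac{n-1}{n}\,(1-a)^2, \\
d(m, m_{\Theta})^2 &= \tfrac{1}{2}\,a^2\left(1 + 1 - \tfrac{2}{n}\right) = \tfrac{n-1}{n}\,a^2, \\
Contr_m(m) &= \sqrt{\tfrac{2n}{n-1}}\left[a\,\sqrt{\tfrac{n-1}{n}}\,(1-a) + (1-a)\,\sqrt{\tfrac{n-1}{n}}\,a\right]
            = 2\sqrt{2}\,a(1-a).
\end{align*}
```

Under these assumptions the value is symmetric in a and 1 - a, vanishes at a = 0 and a = 1, and peaks at a = 0.5 with value $\sqrt{2}/2 \approx 0.707$, independently of n.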


C. Example 3 


In this experiment, we use bba’s whose focal elements 
have the same cardinality. Suppose that the FOD is 
$\Theta = \{\theta_1, \theta_2, \ldots, \theta_5\}$. 
The initial bba is 

$$m(\{\theta_1, \theta_2\}) = m(\{\theta_1, \theta_3\}) = m(\{\theta_1, \theta_4\}) = m(\{\theta_2, \theta_3\}) = m(\{\theta_2, \theta_4\}) = 0; \quad m(\{\theta_3, \theta_4\}) = 1$$

Then at each step, the mass $m(\{\theta_3, \theta_4\})$ decreases by Δ = 
0.05, and the mass of each of the other focal elements increases 
by Δ/5 = 0.05/5 = 0.01. After 16 steps, masses 
of all the focal elements become equal. 

Fig. 4. Comparisons among different uncertainty measures in Example 3 


As we can see in Fig. 4, non-specificity cannot detect 
the change of the bba. This is because non-specificity mainly 
concerns the cardinality of the focal elements. 

AU can detect the change of the bba, but after step 10 the 
values of AU remain the same in the following 
steps. Thus AU is not sensitive to the change of the bba. 

With the change of the bba at each step, K and dissonance 
change very little. Thus here K and dissonance are not very 
sensitive to the change of the bba. 

The proposed contradiction and confusion can both detect 
the change of the bba well. 


D. Example 4 
Suppose that the FOD is $\Theta = \{\theta_1, \theta_2\}$. The initial bba is 

$$m(\{\theta_1\}) = a, \quad m(\{\theta_2\}) = b, \quad m(\{\theta_1, \theta_2\}) = 1 - a - b.$$

Suppose that $a, b \in [0, 0.5]$; we calculate the values of all the 
uncertainty measures as a and b change. 

As we can see in Fig. 5, as a and b change, AU 
always stays the same. 
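
The constancy of AU in this example follows directly from the constraints of Eq. (12). A short worked check, added here for illustration and assuming the definition of AU in Eq. (11):

```latex
\begin{align*}
& Bel(\{\theta_1\}) = a \;\le\; p_{\theta_1} \;\le\; 1 - Bel(\{\theta_2\}) = 1 - b,
  \qquad p_{\theta_1} + p_{\theta_2} = 1, \\
& a, b \in [0, 0.5] \;\Rightarrow\; a \le \tfrac{1}{2} \le 1 - b
  \;\Rightarrow\; (p_{\theta_1}, p_{\theta_2}) = \left(\tfrac{1}{2}, \tfrac{1}{2}\right) \text{ is admissible}, \\
& AU(Bel) = -2 \cdot \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \quad \text{for every } a, b \in [0, 0.5].
\end{align*}
```

Since the uniform distribution maximizes the Shannon entropy and is always admissible here, the maximum in Eq. (11) never moves.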

All the other measures can detect the change of a and b. 

We can see that the value of the proposed contradiction 
varies relatively uniformly when compared with the other 
measures. Thus the contradiction is neither too sensitive nor 
too insensitive to the change of the bba. 

Its value range is [0,1], which is a good characteristic 
for a measure quantifying the degree of discord. 

Fig. 5. Comparisons among different uncertainty measures in Example 4 


V. FURTHER ANALYSIS 


In the definition of $Contr_m$ in Eq. (15), the distance used is 
Jousselme’s distance. In our work, we have also tried other 
types of distances in the theory of belief functions to construct 
the contradiction, which include: 

1) Betting commitment distance (pignistic probability distance) 

$$d_T(m_1, m_2) = \max_{A \subseteq \Theta} \{ |BetP_1(A) - BetP_2(A)| \} \qquad (16)$$

where BetP represents the pignistic probability of the 
corresponding bba. 

2) Cuzzolin distance 

$$d_{Cuzz}(m_1, m_2) = \sqrt{(m_1 - m_2)^T \, Inc \, Inc^T \, (m_1 - m_2)} \qquad (17)$$

where Inc is 

$$Inc(A, B) = \begin{cases} 1, & \text{if } A \subseteq B \\ 0, & \text{otherwise} \end{cases} \qquad (18)$$

3) Conflict distance 

$$d_K(m_1, m_2) = m_1^T (I - Inc)\, m_2 \qquad (19)$$

4) Bhattacharyya distance 

$$d_B(m_1, m_2) = \left(1 - \sqrt{m_1}^T \sqrt{m_2}\right)^p \qquad (20)$$

We carry out the following experiments to compare the 
contradiction measures built on the different distance 
definitions above. When we use $d_{Cuzz}$ and $d_K$ to construct 
normalized contradiction measures, the normalization factor 
should be (n − 1)/n. 


A. Example 5 
Suppose that the FOD is $\Theta = \{\theta_1, \theta_2, \theta_3\}$. 
The initial bba is 

$$m(\{\theta_1\}) = m(\{\theta_2\}) = m(\{\theta_3\}) = 0; \quad m(\Theta) = 1$$

Then at each step, the mass $m(\Theta)$ decreases by 
Δ = 0.05, and the mass of each $m(\{\theta_i\})$ increases by 
Δ/3 = 0.05/3, where i = 1, 2, 3. 

Fig. 6. Comparisons among different contradiction measures based on 
different distance measures - Example 5 


B. Example 6 


Suppose that the FOD is $\Theta = \{\theta_1, \theta_2, \theta_3\}$. 
The initial bba is 

$$m(\{\theta_1\}) = m(\{\theta_2\}) = m(\{\theta_3\}) = 0; \quad m(\Theta) = 1$$

Then at each step, the mass $m(\Theta)$ decreases by Δ = 0.05, 
and the mass $m(\{\theta_1\})$ increases by Δ = 0.05. In the final 
step, the bba obtained is 

$$m(\{\theta_1\}) = 1; \quad m(\{\theta_2\}) = m(\{\theta_3\}) = m(\Theta) = 0$$

Fig. 7. Comparisons among different contradiction measures based on 
different distance measures - Example 6 


As we can see in Examples 5 and 6, all the contradiction 
measures obtained from the different distance definitions can 
characterize the degree of discord inside BOEs well. So far, 
only Jousselme’s distance is a strict distance metric, so we 
suggest using Jousselme’s distance. 


VI. CONCLUSION 


In this paper, we propose a new normalization of a measure 
called contradiction to characterize the degree of discord or 
conflict inside a body of evidence. This contradiction measure 
is distance-based and it can well describe the discord part 
of the uncertainty in the theory of belief functions. Some 
numerical examples are provided to support the rationality of 
the proposed contradiction measure. 

In our work, we have also made preliminary attempts with other 
types of distance in evidence theory to construct the contradiction 
measure. In our future work, we will further analyze the 
contradiction defined on different distance measures. The 
contradiction measure can represent the qualities of different 
information sources to some extent. Thus we will also try 
to use the contradiction measure in applications based on the 
evaluation of bba’s, for example, the determination of weights in 
weighted evidence combination. 
Abstract—In this paper, we analyze Bayes fusion rule in detail 
from a fusion standpoint, as well as the emblematic Dempster’s 
rule of combination introduced by Shafer in his Mathematical 
Theory of evidence based on belief functions. We propose a new 
interesting formulation of Bayes rule and point out some of its 
properties. A deep analysis of the compatibility of Dempster’s 
fusion rule with Bayes fusion rule is done. Our analysis proves 
clearly that Dempster’s rule of combination does not behave 
as Bayes fusion rule in general, because these methods deal 
very differently with the prior information when it is really 
informative (not uniform). Only in the very particular case where 
the basic belief assignments to combine are Bayesian and when 
the prior information is uniform (or vacuous), Dempster’s rule 
remains consistent with Bayes fusion rule. In more general cases, 
Dempster’s rule is incompatible with Bayes rule and it is not a 
generalization of Bayes fusion rule. 


Keywords—Information fusion, Probability theory, Bayes fusion 
rule, Dempster’s fusion rule. 


I. INTRODUCTION 


In 1979, Lotfi Zadeh questioned in [1] the validity of 
Dempster’s rule of combination [2], [3] proposed by Shafer in 
the Dempster-Shafer Theory (DST) of evidence [4]. For more 
than 30 years, many strong debates [5], [6], [7], [8], [9], [10], 
[11], [12], [13] on the validity of the foundations of DST and of 
Dempster’s rule have bloomed. The purpose of this paper is not 
to discuss the validity of Dempster’s rule, nor the foundations 
of DST, which have already been addressed in previous papers 
[14], [15], [16]. In this paper, we focus only on a deep 
analysis of the real incompatibility of Dempster’s rule with 
Bayes fusion rule. Our analysis supports Mahler’s, briefly 
presented in [17]. This paper is organized as follows. In Section 
II, we recall the basics of conditional probabilities and Bayes 
fusion rule with its main properties. In Section III, we recall 
the basics of belief functions and Dempster’s rule. In Section 
IV, we analyze in detail the incompatibility of Dempster’s 
rule with Bayes rule in general and its partial compatibility 
in the very particular case where prior information is modeled 
by a Bayesian uniform basic belief assignment (bba). Section 
V concludes this paper. 


II. CONDITIONAL PROBABILITIES AND BAYES FUSION 


In this section, we recall the definition of conditional 
probability [18] and present the principle and the properties of 
Bayes fusion rule. We present the structure of this rule, derived 
from the classical definition of conditional probability, in a 
new and uncommon form that will help us analyze its 
partial similarity with Dempster’s rule proposed by Shafer in 
his mathematical theory of evidence [4]. We will show clearly 
why Dempster’s rule fails to be compatible with Bayes rule in 
general. 


A. Conditional probabilities 


Let us consider two random events X and Z. The conditional 
probability mass functions (pmfs) P(X|Z) and P(Z|X) 
are defined (assuming P(X) > 0 and P(Z) > 0) by [18]: 

$$P(X|Z) \triangleq \frac{P(X \cap Z)}{P(Z)} \quad \text{and} \quad P(Z|X) \triangleq \frac{P(X \cap Z)}{P(X)} \qquad (1)$$

which yields Bayes’ theorem: 

$$P(X|Z) = \frac{P(Z|X) P(X)}{P(Z)} \quad \text{and} \quad P(Z|X) = \frac{P(X|Z) P(Z)}{P(X)} \qquad (2)$$

where P(X) is called the a priori probability of X, and 
P(Z|X) is called the likelihood of X. The denominator P(Z) 
plays the role of a normalization constant. 


B. Bayes parallel fusion rule 


In fusion applications, we are often interested in computing 
the probability of an event X given two events $Z_1$ and $Z_2$ 
that have occurred. More precisely, one wants to compute 
$P(X|Z_1 \cap Z_2)$ knowing $P(X|Z_1)$ and $P(X|Z_2)$, where X 
can take N distinct exhaustive and exclusive states $x_i$, i = 
1,2,...,N. Such a problem is traditionally called a 
fusion problem. $P(X|Z_1 \cap Z_2)$ becomes easily computable 
by assuming the following conditional statistical independence 
condition, expressed mathematically by: 

$$(A1): \quad P(Z_1 \cap Z_2 | X) = P(Z_1|X) P(Z_2|X) \qquad (3)$$


With the conditional independence condition (A1), from 
Eq. (1) and Bayes’ theorem one gets: 

$$P(X|Z_1 \cap Z_2) = \frac{\dfrac{P(X|Z_1) P(X|Z_2)}{P(X)}}{\sum_{i=1}^{N} \dfrac{P(X=x_i|Z_1) P(X=x_i|Z_2)}{P(X=x_i)}} \qquad (4)$$

The rule of combination given by Eq. (4) is known as Bayes 
parallel (or product) rule and dates back to Bernoulli [19]. 
Eq. (4) can be rewritten as: 

$$P(X|Z_1 \cap Z_2) = \frac{1}{K(X, Z_1, Z_2)} \cdot P(X|Z_1) \cdot P(X|Z_2) \qquad (5)$$

where the coefficient $K(X, Z_1, Z_2)$ is defined by: 

$$K(X, Z_1, Z_2) \triangleq P(X) \sum_{i=1}^{N} \frac{P(X=x_i|Z_1) P(X=x_i|Z_2)}{P(X=x_i)} \qquad (6)$$

C. Symmetrization of Bayes fusion rule 


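The expression of Bayes fusion rule given by Eq. (4) 
can also be symmetrized in the following form that, quite 
surprisingly, rarely appears in the literature: 

$$P(X|Z_1 \cap Z_2) = \frac{1}{K'(Z_1, Z_2)} \cdot \frac{P(X|Z_1)}{\sqrt{P(X)}} \cdot \frac{P(X|Z_2)}{\sqrt{P(X)}} \qquad (7)$$

where the normalization constant $K'(Z_1, Z_2)$ is given by: 

$$K'(Z_1, Z_2) \triangleq \sum_{i=1}^{N} \frac{P(X=x_i|Z_1)}{\sqrt{P(X=x_i)}} \cdot \frac{P(X=x_i|Z_2)}{\sqrt{P(X=x_i)}} \qquad (8)$$

We call the quantity 
$A_2(X=x_i) \triangleq \frac{P(X=x_i|Z_1)}{\sqrt{P(X=x_i)}} \cdot \frac{P(X=x_i|Z_2)}{\sqrt{P(X=x_i)}}$ 
entering in Eq. (8) the Agreement Factor on 
$X = x_i$ of order 2. The level of the Global Agreement (GA) 
of the conjunctive consensus taking into account the prior pmf 
of X is defined as: 

$$GA_2 \triangleq \sum_{\substack{i_1, i_2 = 1 \\ i_1 = i_2}}^{N} \frac{P(X=x_{i_1}|Z_1)}{\sqrt{P(X=x_{i_1})}} \cdot \frac{P(X=x_{i_2}|Z_2)}{\sqrt{P(X=x_{i_2})}} = K'(Z_1, Z_2) \qquad (9)$$

In fact, with assumption (A1), the probability $P(X|Z_1 \cap Z_2)$ 
given in Eq. (7) is nothing but the simple ratio of the 
agreement factor $A_2(X)$ on X over the global agreement 
$GA_2 = \sum_{i=1}^{N} A_2(X = x_i)$; that is: 

$$P(X|Z_1 \cap Z_2) = \frac{A_2(X)}{GA_2} \qquad (10)$$

The quantity $GC_2$ measures the global conflict (i.e., the total 
conjunctive disagreement) taking into account the prior pmf of 
X: 

$$GC_2 \triangleq \sum_{\substack{i_1, i_2 = 1 \\ i_1 \neq i_2}}^{N} \frac{P(X=x_{i_1}|Z_1)}{\sqrt{P(X=x_{i_1})}} \cdot \frac{P(X=x_{i_2}|Z_2)}{\sqrt{P(X=x_{i_2})}} \qquad (11)$$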


• Symbolic representation of Bayes fusion rule 

The (symmetrized form of) Bayes fusion rule of two posterior 
probability measures $P(X|Z_1)$ and $P(X|Z_2)$, given in Eq. (7), 
requires extra knowledge of the prior probability of X. For 
convenience, we denote this fusion rule symbolically as: 

$$P(X|Z_1 \cap Z_2) = Bayes(P(X|Z_1), P(X|Z_2); P(X)) \qquad (12)$$

• Particular case: Uniform a priori pmf 

In this particular case, all the prior probability values 
$\sqrt{P(X=x_{i_1})} = \sqrt{1/N}$ and $\sqrt{P(X=x_{i_2})} = \sqrt{1/N}$ can 
be simplified in Bayes fusion formulas Eq. (7) and Eq. (8). 
Therefore, Bayes fusion formula (7) reduces to: 

$$P(X|Z_1 \cap Z_2) = \frac{P(X|Z_1) P(X|Z_2)}{\sum_{i=1}^{N} P(X=x_i|Z_1) P(X=x_i|Z_2)} \qquad (13)$$

By convention, Eq. (13) is denoted symbolically as: 

$$P(X|Z_1 \cap Z_2) = Bayes(P(X|Z_1), P(X|Z_2)) \qquad (14)$$

Similarly, the $Bayes(P(X|Z_1), \ldots, P(X|Z_s))$ rule defined with 
a uniform a priori pmf of X will be given by: 

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{\prod_{k=1}^{s} P(X|Z_k)}{\sum_{i=1}^{N} \prod_{k=1}^{s} P(X=x_i|Z_k)} \qquad (15)$$

When P(X) is uniform one has $GA_2^{unif} + GC_2^{unif} = 1$. Eq. 
(13) can be expressed as: 

$$P(X|Z_1 \cap Z_2) = \frac{P(X|Z_1) P(X|Z_2)}{GA_2^{unif}} = \frac{P(X|Z_1) P(X|Z_2)}{1 - GC_2^{unif}} \qquad (16)$$

By a direct extension, one will have: 

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{\prod_{k=1}^{s} P(X|Z_k)}{GA_s^{unif}} = \frac{\prod_{k=1}^{s} P(X|Z_k)}{1 - GC_s^{unif}} \qquad (17)$$

$$GA_s^{unif} \triangleq \sum_{\substack{i_1, \ldots, i_s = 1 \\ i_1 = \ldots = i_s}}^{N} P(X=x_{i_1}|Z_1) \ldots P(X=x_{i_s}|Z_s)$$

$$GC_s^{unif} \triangleq 1 - GA_s^{unif}$$
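
The identity $GA_2^{unif} + GC_2^{unif} = 1$ and the form of Eq. (16) can be checked numerically. The sketch below is illustrative only: the function names and the two posterior pmfs are assumptions, and the global agreement and conflict are computed as the sums of products over equal and unequal index pairs, respectively, after the uniform prior has been simplified.

```python
from itertools import product

def global_agreement(post1, post2):
    """Uniform-prior GA: sum of products over matching states (i1 = i2)."""
    return sum(p * q for p, q in zip(post1, post2))

def global_conflict(post1, post2):
    """Uniform-prior GC: sum of products over non-matching states (i1 != i2)."""
    n = len(post1)
    return sum(post1[i] * post2[j]
               for i, j in product(range(n), repeat=2) if i != j)

def bayes_fusion_uniform(post1, post2):
    """Eq. (16): product of the posteriors divided by the global agreement."""
    ga = global_agreement(post1, post2)
    return [p * q / ga for p, q in zip(post1, post2)]

# Illustrative posteriors on two states.
post_z1 = [0.1, 0.9]
post_z2 = [0.6, 0.4]
ga = global_agreement(post_z1, post_z2)    # 0.06 + 0.36
gc = global_conflict(post_z1, post_z2)     # 0.04 + 0.54
fused = bayes_fusion_uniform(post_z1, post_z2)
```

The two sums together enumerate every pair of states once, which is why they add up to the product of the two total probabilities, i.e., to 1.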


D. Properties of Bayes fusion rule 


• (P1): The pmf P(X) is a neutral element of Bayes 
fusion rule when combining only two sources. 

Proof: A source is called a neutral element of a fusion 
rule if and only if it has no influence on the fusion result. 
P(X) is a neutral element of Bayes rule if and only if 
$Bayes(P(X|Z_1), P(X); P(X)) = P(X|Z_1)$. It can be easily 
verified that this equality holds by replacing $P(X|Z_2)$ by 
P(X) and $P(X = x_i|Z_2)$ by $P(X = x_i)$ (as if the 
conditioning term $Z_2$ vanishes) in Eq. (4). One can also 
verify that $Bayes(P(X), P(X|Z_2); P(X)) = P(X|Z_2)$, which 
completes the proof. 


• (P2): Bayes fusion rule is in general not idempotent. 


Proof: A fusion rule is idempotent if the combination of 
identical inputs is equal to that input. To prove that Bayes 
rule is not idempotent it suffices to prove that, in general, 

$$Bayes(P(X|Z_1), P(X|Z_1); P(X)) \neq P(X|Z_1)$$

From Bayes rule (4), when $P(X|Z_2) = P(X|Z_1)$ we clearly 
get in general 

$$\frac{1}{P(X)} \cdot \frac{P(X|Z_1) P(X|Z_1)}{\sum_{i=1}^{N} \dfrac{P(X=x_i|Z_1) P(X=x_i|Z_1)}{P(X=x_i)}} \neq P(X|Z_1) \qquad (18)$$

except when $Z_1$ and $Z_2$ vanish, because in that case Eq. (18) 
reduces to P(X) on its left and right sides. 
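
Properties (P1) and (P2) can be checked numerically with a small Python sketch of Eq. (4). The function name `bayes_fusion`, the list representation of pmfs and the numerical values are assumptions made for illustration.

```python
def bayes_fusion(post1, post2, prior):
    """Bayes parallel rule of Eq. (4) for pmfs given as lists over the states."""
    unnorm = [p1 * p2 / p0 for p1, p2, p0 in zip(post1, post2, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Illustrative non-uniform prior and one posterior on two states.
prior = [0.2, 0.8]
post_z1 = [0.1, 0.9]

# (P1): fusing a posterior with the prior itself returns the posterior.
with_prior = bayes_fusion(post_z1, prior, prior)

# (P2): fusing a posterior with itself does not return the posterior.
with_itself = bayes_fusion(post_z1, post_z1, prior)
```

In the second case the unnormalized values are 0.01/0.2 = 0.05 and 0.81/0.8 = 1.0125, giving about (0.047, 0.953) instead of (0.1, 0.9).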


• (P3): Bayes fusion rule is in general not associative. 


Proof: A fusion rule f is called associative if and only if it 
satisfies the associative law: f(f(x, y), z) = f(x, f(y, z)) = 
f(y, f(x, z)) = f(x, y, z) for all possible inputs x, y and z. 
Let us prove that Bayes rule is not associative with a very simple 
example. 


Example 1: Let us consider the simplest set of outcomes 
$\{x_1, x_2\}$ for X, with prior pmf: 

$$P(X = x_1) = 0.2 \quad \text{and} \quad P(X = x_2) = 0.8$$

and let us consider the three given sets of posterior pmfs: 

$$P(X = x_1|Z_1) = 0.1 \quad \text{and} \quad P(X = x_2|Z_1) = 0.9$$
$$P(X = x_1|Z_2) = 0.5 \quad \text{and} \quad P(X = x_2|Z_2) = 0.5$$
$$P(X = x_1|Z_3) = 0.6 \quad \text{and} \quad P(X = x_2|Z_3) = 0.4$$

One can see that even if in our example one has 
f(x, f(y, z)) = f(f(x, y), z) = f(y, f(x, z)) because 
$P(X|(Z_1 \cap Z_2) \cap Z_3) = P(X|Z_1 \cap (Z_2 \cap Z_3)) = P(X|Z_2 \cap (Z_1 \cap Z_3))$, 
Bayes fusion rule is not associative since: 

$$P(X|(Z_1 \cap Z_2) \cap Z_3) \neq P(X|Z_1 \cap Z_2 \cap Z_3)$$
$$P(X|Z_1 \cap (Z_2 \cap Z_3)) \neq P(X|Z_1 \cap Z_2 \cap Z_3)$$
$$P(X|Z_2 \cap (Z_1 \cap Z_3)) \neq P(X|Z_1 \cap Z_2 \cap Z_3)$$

• (P4): Bayes fusion rule is associative if and only if P(X) 
is uniform. 

Proof: If P(X) is uniform, Bayes fusion rule is given by Eq. 
(15), which can be rewritten as: 

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{P(X|Z_1 \cap \ldots \cap Z_{s-1}) P(X|Z_s)}{\sum_{i=1}^{N} P(X=x_i|Z_1 \cap \ldots \cap Z_{s-1}) P(X=x_i|Z_s)}$$

Therefore when P(X) is uniform, one has: 

$$Bayes(P(X|Z_1), \ldots, P(X|Z_s)) = Bayes(Bayes(P(X|Z_1), \ldots, P(X|Z_{s-1})), P(X|Z_s)).$$


• (P5): The levels of global agreement and global conflict 
between the sources do not matter in Bayes fusion rule. 


Proof: This property seems surprising at first glance, but, 
since the result of Bayes fusion is nothing but the ratio 
of the agreement on $x_i$ (i = 1,2,...,N) over the global 
agreement factor, many distinct sources with different global 
agreements (and thus with different global conflicts) can yield 
the same Bayes fusion result. Indeed, the ratio is kept unchanged 
when multiplying its numerator and denominator by the same 
non-null scalar value. Consequently, the absolute levels of global 
agreement between the sources (and therefore of global conflict 
also) do not matter in Bayes fusion result. What really matters 
is only the proportions of the relative agreement factors. 


III. BELIEF FUNCTIONS AND DEMPSTER’S RULE 


Belief Functions (BF) were introduced in 1976 
by Glenn Shafer in his mathematical theory of evidence [4], 
also known as Dempster-Shafer Theory (DST), in order to 
reason under uncertainty and to model epistemic uncertainties. 
The emblematic fusion rule proposed by Shafer to combine 
sources of evidence characterized by their basic belief 
assignments (bba’s) is Dempster’s rule, which will be analyzed in 
detail in the sequel. In the literature over the years, DST has 
been widely defended by its proponents, who argue that: 1) 
probability measures are particular cases of belief functions; 
and 2) Dempster’s fusion rule is a generalization of Bayes 
fusion rule. Although statement 1) is correct, because 
probability measures are indeed particular (additive) belief 
functions (called Bayesian belief functions), we will explain 
why the second statement about Dempster’s rule is incorrect 
in general. 


A. Belief functions 


Let © be a frame of discernment of a problem under 
consideration. More precisely, the set © = {01,02,..., 0N} 
consists of a list of N exhaustive and exclusive elements 8;, 
i = 1,2,..., N. Each 6; represents a possible state related to 
the problem we want to solve. The exhaustivity and exclusivity 
of elements of © is referred as Shafer’s model of the frame 
©. A basic belief assignment (bba), also called a belief mass 
function, m(.) : 2° — [0,1] is a mapping from the power set 
of © (i.e. the set of subsets of ©), denoted 2°, to [0,1], that 
verifie the following conditions [4]: 


X m(X)=1 (19) 


XE2° 


m(O) =0 and 


The quantity m(X) represents the mass of belief exactly 
committed to X. An element X € 2° is called a focal element 
if and only if m(X) > 0. The set F(m) £ {X € 2°|m(X) > 
0} of all focal elements of a bba m(.) is called the core of 
the bba. A bba m(.) is said Bayesian if its focal elements 
are singletons of 2°. The vacuous bba characterizing the total 
ignorance denoted [; = 6;U62U...U@y is define by m,(.) : 
2° — [0; 1] such that m,(X) = 0 if X # O, and m,(I;) = 1. 


From any bba $m(.)$, the belief function $\mathrm{Bel}(.)$ and the plausibility function $\mathrm{Pl}(.)$ are defined for all $X \in 2^\Theta$ as:

$$\mathrm{Bel}(X) = \sum_{Y \in 2^\Theta \mid Y \subseteq X} m(Y), \qquad \mathrm{Pl}(X) = \sum_{Y \in 2^\Theta \mid X \cap Y \neq \emptyset} m(Y) \qquad (20)$$

$\mathrm{Bel}(X)$ represents the whole mass of belief that comes from all subsets of $\Theta$ included in $X$. It is interpreted as the lower bound of the probability of $X$, i.e. $P_{\min}(X)$. $\mathrm{Bel}(.)$ is a subadditive measure since $\sum_{\theta_i \in \Theta} \mathrm{Bel}(\theta_i) \leq 1$. $\mathrm{Pl}(X)$ represents the whole mass of belief that comes from all subsets of $\Theta$ compatible with $X$ (i.e., those intersecting $X$). $\mathrm{Pl}(X)$ is interpreted as the upper bound of the probability of $X$, i.e. $P_{\max}(X)$. $\mathrm{Pl}(.)$ is a superadditive measure since $\sum_{\theta_i \in \Theta} \mathrm{Pl}(\theta_i) \geq 1$. $\mathrm{Bel}(X)$ and $\mathrm{Pl}(X)$ are classically seen [4] as lower and upper bounds of an unknown probability $P(.)$, and the following inequality is satisfied for all $X \in 2^\Theta$: $\mathrm{Bel}(X) \leq P(X) \leq \mathrm{Pl}(X)$. The belief function $\mathrm{Bel}(.)$ (and the plausibility function $\mathrm{Pl}(.)$) built from any Bayesian bba $m(.)$ can be interpreted as a (subjective) conditional probability measure provided by a given source of evidence, because if the bba $m(.)$ is Bayesian the following equality always holds [4]: $\mathrm{Bel}(X) = P(X) = \mathrm{Pl}(X)$.
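The definitions of Bel(.) and Pl(.) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a bba is stored as a dictionary from `frozenset` focal elements to masses, and the frame `{a, b, c}`, the masses, and the helper names `powerset`, `bel` and `pl` are hypothetical choices, not taken from the paper.

```python
from itertools import combinations

def powerset(frame):
    """All non-empty subsets of the frame, as frozensets."""
    items = sorted(frame)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def bel(m, X):
    """Bel(X): total mass of the focal elements included in X."""
    return sum(mass for Y, mass in m.items() if Y <= X)

def pl(m, X):
    """Pl(X): total mass of the focal elements intersecting X."""
    return sum(mass for Y, mass in m.items() if Y & X)

# Hypothetical bba on the frame {a, b, c}: masses sum to 1, no mass on the empty set.
frame = frozenset("abc")
m = {frozenset("a"): 0.5, frozenset("ab"): 0.3, frame: 0.2}

# Bel(X) <= Pl(X) holds for every subset X of the frame.
bel_below_pl = all(bel(m, X) <= pl(m, X) + 1e-12 for X in powerset(frame))

# For a Bayesian bba (singleton focal elements only), Bel and Pl coincide.
m_bayes = {frozenset("a"): 0.2, frozenset("b"): 0.3, frozenset("c"): 0.5}
bel_equals_pl = all(abs(bel(m_bayes, X) - pl(m_bayes, X)) < 1e-12
                    for X in powerset(frame))
```

With the non-Bayesian bba above, the interval [Bel(X), Pl(X)] has non-zero width for some subsets, whereas it collapses to a single probability value for the Bayesian bba.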


B. Dempster’s rule of combination 


Dempster’s rule of combination, denoted DS rule, is a mathematical operation, represented symbolically by $\oplus$, which corresponds to the normalized conjunctive fusion rule. Based on Shafer’s model of $\Theta$, the combination of $s > 1$ independent and distinct sources of evidence characterized by their bba’s $m_1(.), \ldots, m_s(.)$ related to the same frame of discernment $\Theta$ is denoted $m_{DS}(.) = [m_1 \oplus \ldots \oplus m_s](.)$. The quantity $m_{DS}(.)$ is defined mathematically as follows: $m_{DS}(\emptyset) \triangleq 0$ and, for all $X \neq \emptyset$ in $2^\Theta$,

$$m_{DS}(X) \triangleq \frac{m_{12\ldots s}(X)}{1 - K_{12\ldots s}} \qquad (21)$$

where the conjunctive agreement on $X$ is given by:

$$m_{12\ldots s}(X) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = X}} m_1(X_1) m_2(X_2) \ldots m_s(X_s) \qquad (22)$$

and where the global conflict is given by:

$$K_{12\ldots s} \triangleq \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = \emptyset}} m_1(X_1) m_2(X_2) \ldots m_s(X_s) \qquad (23)$$

When $K_{12\ldots s} = 1$, the $s$ sources are in total conflict and their combination cannot be computed with DS rule because Eq. (21) is mathematically not defined due to the 0/0 indeterminacy [4]. DS rule is commutative and associative, which makes it very attractive from an engineering implementation standpoint. It has been proved in [4] that the vacuous bba $m_v(.)$ is a neutral element for DS rule because $[m \oplus m_v](.) = [m_v \oplus m](.) = m(.)$ for any bba $m(.)$ defined on $2^\Theta$.
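Eqs. (21)-(23) can be sketched as follows. This is a minimal illustration, assuming bba’s are stored as dictionaries from `frozenset` focal elements to masses; the function name `dempster`, the frame `{a, b}` and the masses are hypothetical.

```python
from functools import reduce
from itertools import product

def dempster(*bbas):
    """Dempster's rule for bba's given as {frozenset: mass} dictionaries.
    Returns the fused bba and the global conflict K."""
    conj = {}        # conjunctive agreement m_{12...s}(X), Eq. (22)
    conflict = 0.0   # global conflict K_{12...s}, Eq. (23)
    for combo in product(*(b.items() for b in bbas)):
        inter = reduce(frozenset.intersection, (X for X, _ in combo))
        mass = 1.0
        for _, w in combo:
            mass *= w
        if inter:
            conj[inter] = conj.get(inter, 0.0) + mass
        else:
            conflict += mass
    if abs(1.0 - conflict) < 1e-12:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Normalization of Eq. (21)
    return {X: v / (1.0 - conflict) for X, v in conj.items()}, conflict

frame = frozenset("ab")
m1 = {frozenset("a"): 0.6, frame: 0.4}
m2 = {frozenset("b"): 0.5, frame: 0.5}
m_v = {frame: 1.0}                      # vacuous bba

fused, conflict = dempster(m1, m2)      # conflict comes from {a} and {b} only
with_vacuous, _ = dempster(m1, m_v)     # the vacuous bba is a neutral element
same_as_m1 = all(abs(with_vacuous[X] - m1[X]) < 1e-12 for X in m1)
```

The explicit `ValueError` mirrors the total-conflict case $K_{12\ldots s} = 1$, where Eq. (21) is not defined.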


IV. ANALYSIS OF COMPATIBILITY OF DEMPSTER’S RULE 
WITH BAYES RULE 


To analyze the compatibility of Dempster’s rule with Bayes rule, we need to work in the probabilistic framework because Bayes fusion rule has been developed only in this theoretical framework. So in the sequel, we will manipulate only probability mass functions (pmfs), related to Bayesian bba’s in the Belief Function framework. If Dempster’s rule is a true (consistent) generalization of Bayes fusion rule, it must provide the same results as Bayes rule when combining Bayesian bba’s; otherwise Dempster’s rule cannot be fairly claimed to be a generalization of Bayes fusion rule. In this section, we analyze the real (partial or total) compatibility of Dempster’s rule with Bayes fusion rule. Two important cases must be analyzed depending on the nature of the prior information $P(X)$ one has at hand for performing the fusion of the sources. These sources to combine will be characterized by the following Bayesian bba’s:


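$$m_k(.) \triangleq \{m_k(\theta_i) = P(X = x_i|Z_k),\ i = 1, 2, \ldots, N\}, \quad k = 1, 2, \ldots, s \qquad (24)$$

The prior information is characterized by a given bba denoted by $m_0(.)$ that can be defined either on $2^\Theta$, or only on $\Theta$ if we want to deal, for the needs of our analysis, with a Bayesian prior. In the latter case, if $m_0(.) \triangleq \{m_0(\theta_i) = P(X = x_i),\ i = 1, 2, \ldots, N\}$ then $m_0(.)$ plays the same role as the prior pmf $P(X)$ in the probabilistic framework.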








When considering a non vacuous prior $m_0(.) \neq m_v(.)$, we denote Dempster’s combination of $s$ sources symbolically as:

$$m_{DS}(.) = DS(m_1(.), \ldots, m_s(.); m_0(.))$$

When the prior bba is vacuous, $m_0(.) = m_v(.)$, then $m_0(.)$ has no impact on Dempster’s fusion result, and so we denote symbolically Dempster’s rule as:

$$m_{DS}(.) = DS(m_1(.), \ldots, m_s(.); m_v(.)) = DS(m_1(.), \ldots, m_s(.))$$


A. Case 1: Uniform Bayesian prior 


It is important to note that Dempster’s fusion formula proposed by Shafer in [4] and recalled in Eq. (21) makes no real distinction between the nature of the sources to combine (whether they are posterior or prior information). In fact, the formula (21) reduces exactly to Bayes rule given in Eq. (17) if the bba’s to combine are Bayesian and if the prior information is either uniform or vacuous. Stated otherwise, the following functional equality holds:

$$DS(m_1(.), \ldots, m_s(.); m_0(.)) = Bayes(P(X|Z_1), \ldots, P(X|Z_s); P(X)) \qquad (25)$$

as soon as all bba’s $m_i(.)$, $i = 1, 2, \ldots, s$ are Bayesian and coincide with $P(X|Z_i)$, $P(X)$ is uniform, and either the prior bba $m_0(.)$ is vacuous ($m_0(.) = m_v(.)$), or $m_0(.)$ is the uniform Bayesian bba.


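Example 2: Let us consider $\Theta(X) = \{x_1, x_2, x_3\}$ with two distinct sources providing the following Bayesian bba’s:

$m_1(x_1) = P(X = x_1|Z_1) = 0.2$ and $m_2(x_1) = 0.5$
$m_1(x_2) = P(X = x_2|Z_1) = 0.3$ and $m_2(x_2) = 0.1$
$m_1(x_3) = P(X = x_3|Z_1) = 0.5$ and $m_2(x_3) = 0.4$

• If we choose as prior $m_0(.)$ the vacuous bba, that is $m_0(x_1 \cup x_2 \cup x_3) = 1$, then one will get (with $K_{12}^{vacuous} = 0.67$):

$$m_{DS}(x_1) = \frac{1}{1 - K_{12}^{vacuous}} m_1(x_1) m_2(x_1) m_0(x_1 \cup x_2 \cup x_3) = \frac{1}{1 - 0.67} \cdot 0.2 \cdot 0.5 \cdot 1 = \frac{0.10}{0.33} \approx 0.3030$$
$$m_{DS}(x_2) = \frac{1}{1 - K_{12}^{vacuous}} m_1(x_2) m_2(x_2) m_0(x_1 \cup x_2 \cup x_3) = \frac{1}{1 - 0.67} \cdot 0.3 \cdot 0.1 \cdot 1 = \frac{0.03}{0.33} \approx 0.0909$$
$$m_{DS}(x_3) = \frac{1}{1 - K_{12}^{vacuous}} m_1(x_3) m_2(x_3) m_0(x_1 \cup x_2 \cup x_3) = \frac{1}{1 - 0.67} \cdot 0.5 \cdot 0.4 \cdot 1 = \frac{0.20}{0.33} \approx 0.6061$$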


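• If we choose as prior $m_0(.)$ the uniform Bayesian bba given by $m_0(x_1) = m_0(x_2) = m_0(x_3) = 1/3$, then we get:

$$m_{DS}(x_1) = \frac{1}{1 - K_{12}^{uniform}} m_1(x_1) m_2(x_1) m_0(x_1) = \frac{1}{1 - 0.89} \cdot 0.2 \cdot 0.5 \cdot 1/3 = \frac{0.10/3}{0.11} \approx 0.3030$$
$$m_{DS}(x_2) = \frac{1}{1 - K_{12}^{uniform}} m_1(x_2) m_2(x_2) m_0(x_2) = \frac{1}{1 - 0.89} \cdot 0.3 \cdot 0.1 \cdot 1/3 = \frac{0.03/3}{0.11} \approx 0.0909$$
$$m_{DS}(x_3) = \frac{1}{1 - K_{12}^{uniform}} m_1(x_3) m_2(x_3) m_0(x_3) = \frac{1}{1 - 0.89} \cdot 0.5 \cdot 0.4 \cdot 1/3 = \frac{0.20/3}{0.11} \approx 0.6061$$

where the degree of conflict when $m_0(.)$ is Bayesian and uniform is now given by $K_{12}^{uniform} = 0.89$.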


Clearly $K_{12}^{uniform} \neq K_{12}^{vacuous}$, but the fusion results obtained with the two distinct priors $m_0(.)$ (vacuous or uniform) are the same because of the algebraic simplification by 1/3 in Dempster’s fusion formula when using the uniform Bayesian bba. When combining Bayesian bba’s $m_1(.)$ and $m_2(.)$, the vacuous prior and the uniform prior $m_0(.)$ have therefore no impact on the result. Indeed, they contain no information that may help to prefer one particular state $x_i$ with respect to the other ones, even if the level of conflict is different in both cases. So, the level of conflict doesn’t matter at all in such a Bayesian case. As already stated, what really matters is only the distribution of relative agreement factors. Only in such very particular cases (i.e. Bayesian bba’s, and vacuous or Bayesian uniform priors) is Dempster’s rule fully consistent with Bayes fusion rule.
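The computations of Example 2 can be reproduced with a short sketch. It relies on the observation that, for Bayesian bba’s, only identical singletons intersect, so the conjunctive agreement reduces to a per-state product. The function name `dempster_bayesian` and the convention `prior=None` for the vacuous prior are illustrative assumptions.

```python
def dempster_bayesian(sources, prior=None):
    """Dempster's rule for Bayesian bba's given as lists of singleton masses.
    prior=None stands for the vacuous prior (mass 1 on the whole frame).
    Returns the fused masses and the degree of conflict K."""
    agree = [1.0] * len(sources[0])
    for src in sources:
        agree = [a * p for a, p in zip(agree, src)]
    if prior is not None:
        agree = [a * p for a, p in zip(agree, prior)]
    total = sum(agree)                  # equals 1 - K
    return [a / total for a in agree], 1.0 - total

m1 = [0.2, 0.3, 0.5]
m2 = [0.5, 0.1, 0.4]

res_vacuous, k_vacuous = dempster_bayesian([m1, m2])
res_uniform, k_uniform = dempster_bayesian([m1, m2], prior=[1/3, 1/3, 1/3])
# res_vacuous and res_uniform are both about [0.3030, 0.0909, 0.6061],
# while k_vacuous is about 0.67 and k_uniform about 0.89.
```

The two conflict levels differ, yet the normalized results coincide, which is the point made in the example.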


B. Case 2: Non uniform Bayesian prior 


Let us consider Dempster’s fusion of Bayesian bba’s with a Bayesian non uniform prior $m_0(.)$. In such a case it is easy to check from the general structures of Bayes fusion rule and Dempster’s fusion rule that these two rules are incompatible. Indeed, in Bayes rule one divides each posterior source $m_i(x_j)$ by $\sqrt[s]{m_0(x_j)}$, $i = 1, 2, \ldots, s$, whereas the prior source $m_0(.)$ is combined in a pure conjunctive manner by Dempster’s rule with the bba’s $m_i(.)$, $i = 1, 2, \ldots, s$, as if $m_0(.)$ was a simple additional source. This difference in the processing of prior information between the two approaches clearly explains the incompatibility of Dempster’s rule with Bayes rule when the Bayesian prior bba is not uniform. This incompatibility is illustrated in the next simple example.


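Example 3: Let us consider the same frame $\Theta(X)$, and the same bba’s $m_1(.)$ and $m_2(.)$ as in Example 2. Suppose that the prior information is Bayesian and non uniform as follows: $m_0(x_1) = P(X = x_1) = 0.6$, $m_0(x_2) = P(X = x_2) = 0.3$ and $m_0(x_3) = P(X = x_3) = 0.1$. Bayes rule (10) yields:

$$P(x_1|Z_1 \cap Z_2) = \frac{A_2(x_1)}{GA_2} = \frac{0.2 \cdot 0.5 / 0.6}{2.2667} = \frac{0.1667}{2.2667} \approx 0.0735$$
$$P(x_2|Z_1 \cap Z_2) = \frac{A_2(x_2)}{GA_2} = \frac{0.3 \cdot 0.1 / 0.3}{2.2667} = \frac{0.1000}{2.2667} \approx 0.0441$$
$$P(x_3|Z_1 \cap Z_2) = \frac{A_2(x_3)}{GA_2} = \frac{0.5 \cdot 0.4 / 0.1}{2.2667} = \frac{2.0000}{2.2667} \approx 0.8824$$

Dempster’s rule yields $m_{DS}(x_i) \neq P(x_i|Z_1 \cap Z_2)$ because:

$$m_{DS}(x_1) = \frac{1}{1 - 0.911} \cdot 0.2 \cdot 0.5 \cdot 0.6 = \frac{0.060}{0.089} \approx 0.6742$$
$$m_{DS}(x_2) = \frac{1}{1 - 0.911} \cdot 0.3 \cdot 0.1 \cdot 0.3 = \frac{0.009}{0.089} \approx 0.1011$$
$$m_{DS}(x_3) = \frac{1}{1 - 0.911} \cdot 0.5 \cdot 0.4 \cdot 0.1 = \frac{0.020}{0.089} \approx 0.2247$$

Therefore, one has in general:

$$DS(m_1(.), \ldots, m_s(.); m_0(.)) \neq Bayes(P(X|Z_1), \ldots, P(X|Z_s); P(X))$$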
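The contrast shown in Example 3 can be checked numerically. The sketch below is only an illustration for two sources; pmfs are plain lists and the helper `normalize` is a hypothetical name.

```python
def normalize(values):
    """Scale a list of non-negative numbers so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

m1 = [0.2, 0.3, 0.5]      # P(X = x_i | Z1)
m2 = [0.5, 0.1, 0.4]      # P(X = x_i | Z2)
m0 = [0.6, 0.3, 0.1]      # non uniform Bayesian prior P(X = x_i)

# Bayes rule for two sources: the agreement factors divide by the prior.
bayes = normalize([a * b / p for a, b, p in zip(m1, m2, m0)])

# Dempster's rule with Bayesian bba's: the prior is multiplied in
# as if it were a third source.
dempster = normalize([a * b * p for a, b, p in zip(m1, m2, m0)])
```

Here `bayes` is close to [0.0735, 0.0441, 0.8824] and `dempster` to [0.6742, 0.1011, 0.2247]: dividing by the prior and multiplying by it lead to very different rankings of the states.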


V. CONCLUSIONS 


In this paper¹ we have analyzed in detail the expression and the properties of Bayes rule of combination based on the statistical conditional independence assumption, as well as the emblematic Dempster’s rule of combination of belief functions introduced by Shafer in his Mathematical Theory of Evidence. We have clearly explained from a theoretical standpoint, and also on simple examples, why Dempster’s rule is not a generalization of Bayes rule in general. The incompatibility of Dempster’s rule with Bayes rule is due to its inability to deal with non uniform Bayesian priors in the same manner as Bayes rule does. Dempster’s rule turns out to be compatible with Bayes rule only in two very particular cases: 1) if all the Bayesian bba’s to combine (including the prior) focus on the same state (i.e. there is a perfect conjunctive consensus between the sources), or 2) if all the bba’s to combine (excluding the prior) are Bayesian, and if the prior bba cannot help to discriminate a particular state of the frame of discernment (i.e. the prior bba is either vacuous, or Bayesian and uniform). Except in these two very particular cases, Dempster’s rule is totally incompatible with Bayes rule. Therefore, Dempster’s rule cannot be claimed to be a generalization of Bayes fusion rule, even when the bba’s to combine are Bayesian.

¹An extended version of this paper will be presented at Fusion 2013 conference [20].
Abstract—In this paper, we analyze Bayes fusion rule in detail from a fusion standpoint, as well as the emblematic Dempster’s rule of combination introduced by Shafer in his Mathematical Theory of Evidence based on belief functions. We propose a new formulation of Bayes rule and point out some of its properties. We then analyze in depth the compatibility of Dempster’s fusion rule with Bayes fusion rule. We show that Dempster’s rule is compatible with Bayes fusion rule only in the very particular case where the basic belief assignments (bba’s) to combine are Bayesian, and when the prior information is modeled either by a uniform probability measure, or by a vacuous bba. We show clearly that Dempster’s rule becomes incompatible with Bayes rule in the more general case where the prior is truly informative (neither uniform nor vacuous). Consequently, this paper proves that Dempster’s rule is not a generalization of Bayes fusion rule.


Keywords—Information fusion, Probability theory, Bayes fusion 
rule, Dempster’s fusion rule. 


I. INTRODUCTION 


In 1979, Lotfi Zadeh questioned in [1] the validity of Dempster’s rule of combination [2], [3] proposed by Shafer in the Dempster-Shafer Theory (DST) of evidence [4]. For more than 30 years, many strong debates [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15] on the validity of the foundations of DST and Dempster’s rule have flourished. The purpose of this paper is not to discuss the validity of Dempster’s rule, nor the foundations of DST, which have already been addressed in previous papers [16], [17], [18]. In this paper, we focus only on a deep analysis of the real incompatibility of Dempster’s rule with Bayes fusion rule. Our analysis supports Mahler’s, briefly presented in [19].


This paper is organized as follows. In section II, we recall the basics of conditional probabilities and Bayes fusion rule with its main properties. In section III, we recall the basics of belief functions and Dempster’s rule. In section IV, we analyze in detail the incompatibility of Dempster’s rule with Bayes rule in general and its partial compatibility for the very particular case when the prior information is modeled by a Bayesian uniform basic belief assignment (bba). Section V concludes this paper.


II. CONDITIONAL PROBABILITIES AND BAYES FUSION 


In this section, we recall the definition of conditional probability [20], [21] and present the principle and the properties of Bayes fusion rule. We present the structure of this rule, derived from the classical definition of the conditional probability, in a new and uncommon form that will help us to analyze its partial similarity with Dempster’s rule proposed by Shafer in his mathematical theory of evidence [4]. We will show clearly why Dempster’s rule fails to be compatible with Bayes rule in general.


A. Conditional probabilities 


Let us consider two random events $X$ and $Z$. The conditional probability mass functions (pmfs) $P(X|Z)$ and $P(Z|X)$ are defined¹ (assuming $P(X) > 0$ and $P(Z) > 0$) by [20]:

$$P(X|Z) \triangleq \frac{P(X \cap Z)}{P(Z)} \quad \text{and} \quad P(Z|X) \triangleq \frac{P(X \cap Z)}{P(X)} \qquad (1)$$

From Eq. (1), one gets $P(X \cap Z) = P(X|Z)P(Z) = P(Z|X)P(X)$, which yields Bayes Theorem:

$$P(X|Z) = \frac{P(Z|X)P(X)}{P(Z)} \quad \text{and} \quad P(Z|X) = \frac{P(X|Z)P(Z)}{P(X)} \qquad (2)$$

where $P(X)$ is called the a priori probability of $X$, and $P(Z|X)$ is called the likelihood of $X$. The denominator $P(Z)$ plays the role of a normalization constant warranting that $\sum_{i=1}^{N} P(X = x_i|Z) = 1$. In fact $P(Z)$ can be rewritten as

$$P(Z) = \sum_{i=1}^{N} P(Z|X = x_i) P(X = x_i) \qquad (3)$$

The set of the $N$ possible exclusive and exhaustive outcomes of $X$ is denoted $\Theta(X) \triangleq \{x_i, i = 1, 2, \ldots, N\}$.


B. Bayes parallel fusion rule 


In fusion applications, we are often interested in computing the probability of an event $X$ given two events $Z_1$ and $Z_2$ that have occurred. More precisely, one wants to compute $P(X|Z_1 \cap Z_2)$ knowing $P(X|Z_1)$ and $P(X|Z_2)$, where $X$ can take $N$ distinct exhaustive and exclusive states $x_i$, $i = 1, 2, \ldots, N$. Such a type of problem is traditionally called a fusion problem. The computation of $P(X|Z_1 \cap Z_2)$ from $P(X|Z_1)$ and $P(X|Z_2)$ cannot be done in general without the knowledge of the probabilities $P(X)$ and $P(X|Z_1 \cup Z_2)$, which are rarely given. However, $P(X|Z_1 \cap Z_2)$ becomes easily computable by assuming the following conditional statistical independence condition, expressed mathematically by:

$$(A1): \quad P(Z_1 \cap Z_2|X) = P(Z_1|X)P(Z_2|X) \qquad (4)$$

With such a conditional independence condition (A1), from Eq. (1) and Bayes Theorem one gets:

$$P(X|Z_1 \cap Z_2) = \frac{P(Z_1 \cap Z_2 \cap X)}{P(Z_1 \cap Z_2)} = \frac{P(Z_1 \cap Z_2|X)P(X)}{P(Z_1 \cap Z_2)} = \frac{P(Z_1|X)P(Z_2|X)P(X)}{\sum_{i=1}^{N} P(Z_1|X = x_i)P(Z_2|X = x_i)P(X = x_i)}$$

Using again Eq. (2), we have:

$$P(Z_1|X) = \frac{P(X|Z_1)P(Z_1)}{P(X)} \quad \text{and} \quad P(Z_2|X) = \frac{P(X|Z_2)P(Z_2)}{P(X)}$$

and the previous formula of the conditional probability $P(X|Z_1 \cap Z_2)$ can be rewritten as:

$$P(X|Z_1 \cap Z_2) = \frac{\dfrac{P(X|Z_1)P(X|Z_2)}{P(X)}}{\sum_{i=1}^{N} \dfrac{P(X = x_i|Z_1)P(X = x_i|Z_2)}{P(X = x_i)}} \qquad (5)$$

¹For convenience and simplicity, we use the notation $P(X|Z)$ instead of $P(X = x|Z = z)$, and $P(Z|X)$ instead of $P(Z = z|X = x)$, where $x$ and $z$ would represent precisely particular outcomes of the random variables $X$ and $Z$.


The rule of combination given by Eq. (5) is known as Bayes parallel (or product) rule and dates back to Bernoulli [22]. In the classification framework, this formula is also called the Naive Bayesian Classifier because it uses the assumption (A1), which is often considered very unrealistic and too simplistic, and that is why it is called a naive assumption. Eq. (5) can be rewritten as:

$$P(X|Z_1 \cap Z_2) = \frac{1}{K(X, Z_1, Z_2)} \cdot P(X|Z_1) \cdot P(X|Z_2) \qquad (6)$$

where the coefficient $K(X, Z_1, Z_2)$ is defined by:

$$K(X, Z_1, Z_2) \triangleq P(X) \cdot \sum_{i=1}^{N} \frac{P(X = x_i|Z_1)P(X = x_i|Z_2)}{P(X = x_i)} \qquad (7)$$
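As a minimal sketch of Eq. (5) and of its rewriting in Eqs. (6)-(7), assuming pmfs are stored as plain Python lists indexed by state (the function name `bayes_parallel` and the numerical values are illustrative):

```python
def bayes_parallel(p1, p2, prior):
    """Bayes parallel (product) rule of Eq. (5) for two posterior pmfs."""
    agree = [a * b / p for a, b, p in zip(p1, p2, prior)]
    total = sum(agree)
    return [a / total for a in agree]

prior = [0.2, 0.8]     # P(X = x_i), illustrative values
p_z1 = [0.1, 0.9]      # P(X = x_i | Z1)
p_z2 = [0.5, 0.5]      # P(X = x_i | Z2)

fused = bayes_parallel(p_z1, p_z2, prior)

# Same result through Eqs. (6)-(7): the coefficient K depends on the
# state x through the factor P(X = x).
s = sum(a * b / p for a, b, p in zip(p_z1, p_z2, prior))
fused_via_k = [a * b / (p * s) for a, b, p in zip(p_z1, p_z2, prior)]
```

Both `fused` and `fused_via_k` evaluate to approximately [0.3077, 0.6923] for these inputs.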


C. Symmetrization of Bayes fusion rule 


The expression of Bayes fusion rule given by Eq. (5) can also be symmetrized in the following form that, quite surprisingly, rarely appears in the literature:

$$P(X|Z_1 \cap Z_2) = \frac{\dfrac{P(X|Z_1)}{\sqrt{P(X)}} \cdot \dfrac{P(X|Z_2)}{\sqrt{P(X)}}}{\sum_{i=1}^{N} \dfrac{P(X = x_i|Z_1)}{\sqrt{P(X = x_i)}} \cdot \dfrac{P(X = x_i|Z_2)}{\sqrt{P(X = x_i)}}} \qquad (8)$$

or in an equivalent manner:

$$P(X|Z_1 \cap Z_2) = \frac{1}{K'(Z_1, Z_2)} \cdot \frac{P(X|Z_1)}{\sqrt{P(X)}} \cdot \frac{P(X|Z_2)}{\sqrt{P(X)}} \qquad (9)$$

where the normalization constant $K'(Z_1, Z_2)$ is given by:

$$K'(Z_1, Z_2) \triangleq \sum_{i=1}^{N} \frac{P(X = x_i|Z_1)}{\sqrt{P(X = x_i)}} \cdot \frac{P(X = x_i|Z_2)}{\sqrt{P(X = x_i)}} \qquad (10)$$

We call the quantity $A_2(X = x_i) \triangleq \dfrac{P(X = x_i|Z_1)P(X = x_i|Z_2)}{P(X = x_i)}$ entering in Eq. (10) the Agreement Factor on $X = x_i$ of order 2, because only two posterior pmfs are used in the derivation. $A_2(X = x_i)$ corresponds to the posterior conjunctive consensus on the event $X = x_i$ taking into account the prior pmf of $X$. The denominator of Eq. (8) measures the level of the Global Agreement (GA) of the conjunctive consensus taking into account the prior pmf of $X$. It is denoted² $GA_2$.

$$GA_2 \triangleq \sum_{\substack{i_1, i_2 = 1 \\ i_1 = i_2}}^{N} \frac{P(X = x_{i_1}|Z_1)}{\sqrt{P(X = x_{i_1})}} \cdot \frac{P(X = x_{i_2}|Z_2)}{\sqrt{P(X = x_{i_2})}} = \sum_{i=1}^{N} \frac{P(X = x_i|Z_1)}{\sqrt{P(X = x_i)}} \cdot \frac{P(X = x_i|Z_2)}{\sqrt{P(X = x_i)}} \qquad (11)$$


In fact, with assumption (A1), the probability $P(X|Z_1 \cap Z_2)$ given in Eq. (9) is nothing but the simple ratio of the agreement factor $A_2(X)$ (conjunctive consensus) on $X$ over the global agreement $GA_2 = \sum_{i=1}^{N} A_2(X = x_i)$, that is:

$$P(X|Z_1 \cap Z_2) = \frac{A_2(X)}{GA_2} \qquad (12)$$

The quantity $GC_2$ given in Eq. (13) measures the global conflict (i.e. the total conjunctive disagreement) taking into account the prior pmf of $X$.

$$GC_2 \triangleq \sum_{\substack{i_1, i_2 = 1 \\ i_1 \neq i_2}}^{N} \frac{P(X = x_{i_1}|Z_1)}{\sqrt{P(X = x_{i_1})}} \cdot \frac{P(X = x_{i_2}|Z_2)}{\sqrt{P(X = x_{i_2})}} \qquad (13)$$
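• Generalization to $P(X|Z_1 \cap Z_2 \cap \ldots \cap Z_s)$


It can be proved that, when assuming conditional independence conditions, Bayes parallel combination rule can be generalized for combining $s > 2$ posterior pmfs as:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{1}{K(X, Z_1, \ldots, Z_s)} \cdot \prod_{k=1}^{s} P(X|Z_k) \qquad (14)$$

where the coefficient $K(X, Z_1, \ldots, Z_s)$ is defined by:

$$K(X, Z_1, \ldots, Z_s) \triangleq P(X) \cdot \sum_{i=1}^{N} \frac{\prod_{k=1}^{s} P(X = x_i|Z_k)}{P(X = x_i)} \qquad (15)$$

The symmetrized form of Eq. (14) is:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{1}{K'(Z_1, \ldots, Z_s)} \cdot \prod_{k=1}^{s} \frac{P(X|Z_k)}{\sqrt[s]{P(X)}} \qquad (16)$$

with the normalization constant $K'(Z_1, \ldots, Z_s)$ given by:

$$K'(Z_1, \ldots, Z_s) \triangleq \sum_{i=1}^{N} \prod_{k=1}^{s} \frac{P(X = x_i|Z_k)}{\sqrt[s]{P(X = x_i)}} \qquad (17)$$

²The index 2 is introduced explicitly in the notations because we consider only the fusion of two posterior pmfs.

The generalization of $A_2(X)$, $GA_2$, and $GC_2$ provides the agreement $A_s(X)$ of order $s$, the global agreement $GA_s$ and the global conflict $GC_s$ for $s$ sources as follows:

$$A_s(X = x_i) \triangleq \frac{\prod_{k=1}^{s} P(X = x_i|Z_k)}{P(X = x_i)}$$

$$GA_s \triangleq \sum_{\substack{i_1, \ldots, i_s = 1 \\ i_1 = \ldots = i_s}}^{N} \frac{P(X = x_{i_1}|Z_1)}{\sqrt[s]{P(X = x_{i_1})}} \cdots \frac{P(X = x_{i_s}|Z_s)}{\sqrt[s]{P(X = x_{i_s})}}$$

$$GC_s \triangleq \sum_{i_1, \ldots, i_s = 1}^{N} \frac{P(X = x_{i_1}|Z_1)}{\sqrt[s]{P(X = x_{i_1})}} \cdots \frac{P(X = x_{i_s}|Z_s)}{\sqrt[s]{P(X = x_{i_s})}} - GA_s$$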


• Symbolic representation of Bayes fusion rule


The (symmetrized form of) Bayes fusion rule of two posterior probability measures $P(X|Z_1)$ and $P(X|Z_2)$, given in Eq. (9), requires an extra knowledge of the prior probability of $X$. For convenience, we denote symbolically this fusion rule as:

$$P(X|Z_1 \cap Z_2) = Bayes(P(X|Z_1), P(X|Z_2); P(X)) \qquad (18)$$

Similarly, the (symmetrized) Bayes fusion rule of $s \geq 2$ probability measures $P(X|Z_k)$, $k = 1, 2, \ldots, s$ given by Eq. (16), which also requires the knowledge of $P(X)$, will be denoted as:

$$P(X|Z_1 \cap \ldots \cap Z_s) = Bayes(P(X|Z_1), \ldots, P(X|Z_s); P(X))$$


• Particular case: Uniform a priori pmf


If the random variable $X$ is assumed to be a priori uniformly distributed over the space of its $N$ possible outcomes, then the probability of $X$ is equal to $P(X = x_i) = 1/N$ for $i = 1, 2, \ldots, N$. In such a particular case, all the prior probability values $\sqrt{P(X = x_i)} = \sqrt{1/N}$ and $\sqrt[s]{P(X = x_i)} = \sqrt[s]{1/N}$ can be simplified in Bayes fusion formulas Eq. (9) and Eq. (10). Therefore, Bayes fusion formula (9) reduces to:

$$P(X|Z_1 \cap Z_2) = \frac{P(X|Z_1)P(X|Z_2)}{\sum_{i=1}^{N} P(X = x_i|Z_1)P(X = x_i|Z_2)} \qquad (19)$$

By convention, Eq. (19) is denoted symbolically as:

$$P(X|Z_1 \cap Z_2) = Bayes(P(X|Z_1), P(X|Z_2)) \qquad (20)$$

Similarly, the $Bayes(P(X|Z_1), \ldots, P(X|Z_s))$ rule defined with a uniform a priori pmf of $X$ will be given by:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{\prod_{k=1}^{s} P(X|Z_k)}{\sum_{i=1}^{N} \prod_{k=1}^{s} P(X = x_i|Z_k)} \qquad (21)$$

When $P(X)$ is uniform and from Eq. (19), one can redefine the global agreement and the global conflict as:

$$GA_2^{unif} \triangleq \sum_{\substack{i, j = 1 \\ i = j}}^{N} P(X = x_i|Z_1)P(X = x_j|Z_2) \qquad (22)$$

$$GC_2^{unif} \triangleq \sum_{\substack{i, j = 1 \\ i \neq j}}^{N} P(X = x_i|Z_1)P(X = x_j|Z_2) \qquad (23)$$

Because $\sum_{i=1}^{N} P(X = x_i|Z_1) = 1$ and $\sum_{i=1}^{N} P(X = x_i|Z_2) = 1$, then

$$1 = \Big(\sum_{i=1}^{N} P(X = x_i|Z_1)\Big)\Big(\sum_{j=1}^{N} P(X = x_j|Z_2)\Big) = \sum_{i,j=1}^{N} P(X = x_i|Z_1)P(X = x_j|Z_2)$$
$$= \sum_{\substack{i, j = 1 \\ i = j}}^{N} P(X = x_i|Z_1)P(X = x_j|Z_2) + \sum_{\substack{i, j = 1 \\ i \neq j}}^{N} P(X = x_i|Z_1)P(X = x_j|Z_2)$$

Therefore, one always has $GA_2^{unif} + GC_2^{unif} = 1$ when $P(X)$ is uniform, and Eq. (19) can be expressed as:

$$P(X|Z_1 \cap Z_2) = \frac{P(X|Z_1)P(X|Z_2)}{GA_2^{unif}} = \frac{P(X|Z_1)P(X|Z_2)}{1 - GC_2^{unif}} \qquad (24)$$

By a direct extension, one will have:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{\prod_{k=1}^{s} P(X|Z_k)}{GA_s^{unif}} = \frac{\prod_{k=1}^{s} P(X|Z_k)}{1 - GC_s^{unif}} \qquad (25)$$

$$GA_s^{unif} \triangleq \sum_{\substack{i_1, \ldots, i_s = 1 \\ i_1 = \ldots = i_s}}^{N} P(X = x_{i_1}|Z_1) \ldots P(X = x_{i_s}|Z_s)$$

$$GC_s^{unif} \triangleq 1 - GA_s^{unif}$$

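Remark 1: The normalization coefficient corresponding to the global conjunctive agreement $GA_s^{unif}$ can also be expressed using belief function notations [4] as:

$$GA_s^{unif} = \sum_{\substack{x_{i_1}, \ldots, x_{i_s} \in \Theta(X) \\ x_{i_1} \cap \ldots \cap x_{i_s} \neq \emptyset}} P(X = x_{i_1}|Z_1) \ldots P(X = x_{i_s}|Z_s)$$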


D. Properties of Bayes fusion rule 


In this subsection, we analyze Bayes fusion rule (assuming condition (A1) holds) from a pure algebraic standpoint. In fusion jargon, the quantities to combine come from sources of information which provide inputs that feed the fusion rule. In the probabilistic framework, a source $s$ to combine corresponds to the posterior pmf $P(X|Z_s)$. In this subsection, we establish five interesting properties of Bayes rule. Contrary to Dempster’s rule, we prove that Bayes rule is not associative in general.


• (P1): The pmf $P(X)$ is a neutral element of Bayes fusion rule when combining only two sources.


Proof: A source is called a neutral element of a fusion rule if and only if it has no influence on the fusion result. $P(X)$ is a neutral element of Bayes rule if and only if $Bayes(P(X|Z_1), P(X); P(X)) = P(X|Z_1)$. It can be easily verified that this equality holds by replacing $P(X|Z_2)$ by $P(X)$ and $P(X = x_i|Z_2)$ by $P(X = x_i)$ (as if the conditioning term $Z_2$ vanishes) in Eq. (5). One can also verify that $Bayes(P(X), P(X|Z_2); P(X)) = P(X|Z_2)$, which completes the proof.


Remark 2: When considering Bayes fusion of more than two sources, $P(X)$ doesn’t play the role of a neutral element in general, except if $P(X)$ is uniform. For example, let us consider 3 pmfs $P(X|Z_1)$, $P(X|Z_2)$ and $P(X|Z_3)$ to combine with formula (14) with $P(X)$ not uniform. When $Z_3$ vanishes so that $P(X|Z_3) = P(X)$, we can easily check that:

$$Bayes(P(X|Z_1), P(X|Z_2), P(X); P(X)) \neq Bayes(P(X|Z_1), P(X|Z_2); P(X)) \qquad (26)$$


• (P2): Bayes fusion rule is in general not idempotent.


Proof: A fusion rule is idempotent if the combination of identical inputs is equal to that input. To prove that Bayes rule is not idempotent it suffices to prove that in general:

$$Bayes(P(X|Z_1), P(X|Z_1); P(X)) \neq P(X|Z_1)$$

From Bayes rule (5), when $P(X|Z_2) = P(X|Z_1)$ we clearly get in general

$$\frac{1}{P(X)} \cdot \frac{P(X|Z_1)P(X|Z_1)}{\sum_{i=1}^{N} \dfrac{P(X = x_i|Z_1)P(X = x_i|Z_1)}{P(X = x_i)}} \neq P(X|Z_1) \qquad (27)$$

except when $Z_1$ and $Z_2$ vanish, because in such a case both sides of Eq. (27) reduce to $P(X)$.


Remark 3: In the particular (two sources) degenerate case where $Z_1$ and $Z_2$ vanish, one always has $Bayes(P(X), P(X); P(X)) = P(X)$. However, in the more general degenerate case (when considering more than 2 sources), one will have in general $Bayes(P(X), P(X), \ldots, P(X); P(X)) \neq P(X)$, except when $P(X)$ is uniform, or when $P(X)$ is a “deterministic” probability measure such that $P(X = x_i) = 1$ for a given $x_i \in \Theta(X)$ and $P(X = x_j) = 0$ for all $x_j \neq x_i$.


• (P3): Bayes fusion rule is in general not associative.


Proof: A fusion rule $f$ is called associative if and only if it satisfies the associative law: $f(f(x, y), z) = f(x, f(y, z)) = f(y, f(x, z)) = f(x, y, z)$ for all possible inputs $x$, $y$ and $z$. Let us prove that Bayes rule is not associative using a very simple example.


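Example 1: Let us consider the simplest set of outcomes $\{x_1, x_2\}$ for $X$, with prior pmf:

$P(X = x_1) = 0.2$ and $P(X = x_2) = 0.8$

and let us consider the three given sets of posterior pmfs:

$P(X = x_1|Z_1) = 0.1$ and $P(X = x_2|Z_1) = 0.9$
$P(X = x_1|Z_2) = 0.5$ and $P(X = x_2|Z_2) = 0.5$
$P(X = x_1|Z_3) = 0.6$ and $P(X = x_2|Z_3) = 0.4$

Bayes fusion $Bayes(P(X|Z_1), P(X|Z_2), P(X|Z_3); P(X))$ of the three sources altogether according to Eq. (16) provides:

$$P(X = x_1|Z_1 \cap Z_2 \cap Z_3) = \frac{1}{K_{123}} \frac{0.1}{\sqrt[3]{0.2}} \frac{0.5}{\sqrt[3]{0.2}} \frac{0.6}{\sqrt[3]{0.2}} = 0.40$$
$$P(X = x_2|Z_1 \cap Z_2 \cap Z_3) = \frac{1}{K_{123}} \frac{0.9}{\sqrt[3]{0.8}} \frac{0.5}{\sqrt[3]{0.8}} \frac{0.4}{\sqrt[3]{0.8}} = 0.60$$

where the normalization constant $K_{123}$ is given by:

$$K_{123} = \frac{0.1}{\sqrt[3]{0.2}} \frac{0.5}{\sqrt[3]{0.2}} \frac{0.6}{\sqrt[3]{0.2}} + \frac{0.9}{\sqrt[3]{0.8}} \frac{0.5}{\sqrt[3]{0.8}} \frac{0.4}{\sqrt[3]{0.8}} = 0.375$$

Let us compute the fusion of $P(X|Z_1)$ with $P(X|Z_2)$ using $Bayes(P(X|Z_1), P(X|Z_2); P(X))$. One has:

$$P(X = x_1|Z_1 \cap Z_2) = \frac{1}{K_{12}} \frac{0.1}{\sqrt{0.2}} \frac{0.5}{\sqrt{0.2}} \approx 0.3077$$
$$P(X = x_2|Z_1 \cap Z_2) = \frac{1}{K_{12}} \frac{0.9}{\sqrt{0.8}} \frac{0.5}{\sqrt{0.8}} \approx 0.6923$$

where the normalization constant $K_{12}$ is given by:

$$K_{12} = \frac{0.1}{\sqrt{0.2}} \frac{0.5}{\sqrt{0.2}} + \frac{0.9}{\sqrt{0.8}} \frac{0.5}{\sqrt{0.8}} = 0.8125$$

Let us compute the fusion of $P(X|Z_2)$ with $P(X|Z_3)$ using $Bayes(P(X|Z_2), P(X|Z_3); P(X))$. One has:

$$P(X = x_1|Z_2 \cap Z_3) = \frac{1}{K_{23}} \frac{0.5}{\sqrt{0.2}} \frac{0.6}{\sqrt{0.2}} \approx 0.8571$$
$$P(X = x_2|Z_2 \cap Z_3) = \frac{1}{K_{23}} \frac{0.5}{\sqrt{0.8}} \frac{0.4}{\sqrt{0.8}} \approx 0.1429$$

where the normalization constant $K_{23}$ is given by:

$$K_{23} = \frac{0.5}{\sqrt{0.2}} \frac{0.6}{\sqrt{0.2}} + \frac{0.5}{\sqrt{0.8}} \frac{0.4}{\sqrt{0.8}} = 1.75$$

Let us compute the fusion of $P(X|Z_1)$ with $P(X|Z_3)$ using $Bayes(P(X|Z_1), P(X|Z_3); P(X))$. One has:

$$P(X = x_1|Z_1 \cap Z_3) = \frac{1}{K_{13}} \frac{0.1}{\sqrt{0.2}} \frac{0.6}{\sqrt{0.2}} = 0.40$$
$$P(X = x_2|Z_1 \cap Z_3) = \frac{1}{K_{13}} \frac{0.9}{\sqrt{0.8}} \frac{0.4}{\sqrt{0.8}} = 0.60$$

where the normalization constant $K_{13}$ is given by:

$$K_{13} = \frac{0.1}{\sqrt{0.2}} \frac{0.6}{\sqrt{0.2}} + \frac{0.9}{\sqrt{0.8}} \frac{0.4}{\sqrt{0.8}} = 0.75$$

Let us compute the fusion of $P(X|Z_1 \cap Z_2)$ with $P(X|Z_3)$ using $Bayes(P(X|Z_1 \cap Z_2), P(X|Z_3); P(X))$. One has:

$$P(X = x_1|(Z_1 \cap Z_2) \cap Z_3) = \frac{1}{K_{(12)3}} \frac{0.3077}{\sqrt{0.2}} \frac{0.6}{\sqrt{0.2}} \approx 0.7273$$
$$P(X = x_2|(Z_1 \cap Z_2) \cap Z_3) = \frac{1}{K_{(12)3}} \frac{0.6923}{\sqrt{0.8}} \frac{0.4}{\sqrt{0.8}} \approx 0.2727$$

where the normalization constant $K_{(12)3}$ is given by:

$$K_{(12)3} = \frac{0.3077}{\sqrt{0.2}} \frac{0.6}{\sqrt{0.2}} + \frac{0.6923}{\sqrt{0.8}} \frac{0.4}{\sqrt{0.8}} \approx 1.26925$$

Let us compute the fusion of $P(X|Z_1)$ with $P(X|Z_2 \cap Z_3)$ using $Bayes(P(X|Z_1), P(X|Z_2 \cap Z_3); P(X))$. One has:

$$P(X = x_1|Z_1 \cap (Z_2 \cap Z_3)) = \frac{1}{K_{1(23)}} \frac{0.1}{\sqrt{0.2}} \frac{0.8571}{\sqrt{0.2}} \approx 0.7273$$
$$P(X = x_2|Z_1 \cap (Z_2 \cap Z_3)) = \frac{1}{K_{1(23)}} \frac{0.9}{\sqrt{0.8}} \frac{0.1429}{\sqrt{0.8}} \approx 0.2727$$

where the normalization constant $K_{1(23)}$ is given by:

$$K_{1(23)} = \frac{0.1}{\sqrt{0.2}} \frac{0.8571}{\sqrt{0.2}} + \frac{0.9}{\sqrt{0.8}} \frac{0.1429}{\sqrt{0.8}} \approx 0.58931$$

Let us compute the fusion of $P(X|Z_1 \cap Z_3)$ with $P(X|Z_2)$ using $Bayes(P(X|Z_1 \cap Z_3), P(X|Z_2); P(X))$. One has:

$$P(X = x_1|(Z_1 \cap Z_3) \cap Z_2) = \frac{1}{K_{(13)2}} \frac{0.4}{\sqrt{0.2}} \frac{0.5}{\sqrt{0.2}} \approx 0.7273$$
$$P(X = x_2|(Z_1 \cap Z_3) \cap Z_2) = \frac{1}{K_{(13)2}} \frac{0.6}{\sqrt{0.8}} \frac{0.5}{\sqrt{0.8}} \approx 0.2727$$

where the normalization constant $K_{(13)2}$ is given by:

$$K_{(13)2} = \frac{0.4}{\sqrt{0.2}} \frac{0.5}{\sqrt{0.2}} + \frac{0.6}{\sqrt{0.8}} \frac{0.5}{\sqrt{0.8}} = 1.375$$

Therefore, one sees that even if in our example one has $f(x, f(y, z)) = f(f(x, y), z) = f(y, f(x, z))$ because $P(X|(Z_1 \cap Z_2) \cap Z_3) = P(X|Z_1 \cap (Z_2 \cap Z_3)) = P(X|Z_2 \cap (Z_1 \cap Z_3))$, Bayes fusion rule is not associative since:

$$P(X|(Z_1 \cap Z_2) \cap Z_3) \neq P(X|Z_1 \cap Z_2 \cap Z_3)$$
$$P(X|Z_1 \cap (Z_2 \cap Z_3)) \neq P(X|Z_1 \cap Z_2 \cap Z_3)$$
$$P(X|Z_2 \cap (Z_1 \cap Z_3)) \neq P(X|Z_1 \cap Z_2 \cap Z_3)$$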
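The numbers of Example 1 can be reproduced with the following sketch. It implements Eq. (16) as it is written in this paper, where each of the $s$ posteriors is divided by the $s$-th root of the prior, so the product is divided by the prior exactly once. The function name `bayes_fuse` is an illustrative assumption.

```python
def bayes_fuse(posteriors, prior):
    """Eq. (16) as written in this paper: the product of the posteriors
    is divided once by the prior, then normalized."""
    agree = []
    for i in range(len(prior)):
        prod = 1.0
        for p in posteriors:
            prod *= p[i]
        agree.append(prod / prior[i])
    k = sum(agree)                      # normalization constant K'
    return [a / k for a in agree]

prior = [0.2, 0.8]
p1, p2, p3 = [0.1, 0.9], [0.5, 0.5], [0.6, 0.4]

joint = bayes_fuse([p1, p2, p3], prior)                          # three sources at once
seq_12_3 = bayes_fuse([bayes_fuse([p1, p2], prior), p3], prior)  # (Z1, Z2) then Z3
seq_1_23 = bayes_fuse([p1, bayes_fuse([p2, p3], prior)], prior)  # Z1 with (Z2, Z3)
seq_13_2 = bayes_fuse([bayes_fuse([p1, p3], prior), p2], prior)  # (Z1, Z3) then Z2
```

With these inputs `joint` is about [0.40, 0.60], while the three sequential results are all about [0.7273, 0.2727], matching the values reported in the example.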


• (P4): Bayes fusion rule is associative if and only if $P(X)$ is uniform.


Proof: If $P(X)$ is uniform, Bayes fusion rule is given by Eq. (21), which can be rewritten as:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{P(X|Z_s) \prod_{k=1}^{s-1} P(X|Z_k)}{\sum_{i=1}^{N} P(X = x_i|Z_s) \prod_{k=1}^{s-1} P(X = x_i|Z_k)}$$

By introducing the term $1 / \sum_{i=1}^{N} \prod_{k=1}^{s-1} P(X = x_i|Z_k)$ in the numerator and the denominator of the previous formula, it comes:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{P(X|Z_s) \dfrac{\prod_{k=1}^{s-1} P(X|Z_k)}{\sum_{i=1}^{N} \prod_{k=1}^{s-1} P(X = x_i|Z_k)}}{\sum_{i=1}^{N} P(X = x_i|Z_s) \dfrac{\prod_{k=1}^{s-1} P(X = x_i|Z_k)}{\sum_{j=1}^{N} \prod_{k=1}^{s-1} P(X = x_j|Z_k)}}$$

which can be simply rewritten as:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{P(X|Z_1 \cap \ldots \cap Z_{s-1}) P(X|Z_s)}{\sum_{i=1}^{N} P(X = x_i|Z_1 \cap \ldots \cap Z_{s-1}) P(X = x_i|Z_s)}$$

Therefore when $P(X)$ is uniform, one has:

$$Bayes(P(X|Z_1), \ldots, P(X|Z_s)) = Bayes(Bayes(P(X|Z_1), \ldots, P(X|Z_{s-1})), P(X|Z_s))$$

The previous relation was based on the decomposition of $\prod_{k=1}^{s} P(X|Z_k)$ as $P(X|Z_s) \prod_{k=1}^{s-1} P(X|Z_k)$. This choice of decomposition was arbitrary and made only for convenience. In fact $\prod_{k=1}^{s} P(X|Z_k)$ can be decomposed in $s$ different manners, as $P(X|Z_j) \prod_{k=1, k \neq j}^{s} P(X|Z_k)$, $j = 1, 2, \ldots, s$, and a similar analysis can be done. In particular, when $s = 3$, we will have:

$$Bayes(P(X|Z_1), P(X|Z_2), P(X|Z_3)) = Bayes(Bayes(P(X|Z_1), P(X|Z_2)), P(X|Z_3)) = Bayes(P(X|Z_1), Bayes(P(X|Z_2), P(X|Z_3)))$$

which completes the proof.
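The associativity obtained under a uniform prior can be checked numerically on Eq. (21). This is a small sketch; the function name `bayes_uniform` and the posterior values are illustrative.

```python
def bayes_uniform(*posteriors):
    """Eq. (21): normalized product of the posteriors (uniform prior)."""
    prod = [1.0] * len(posteriors[0])
    for p in posteriors:
        prod = [a * b for a, b in zip(prod, p)]
    total = sum(prod)
    return [a / total for a in prod]

p1, p2, p3 = [0.1, 0.9], [0.5, 0.5], [0.6, 0.4]   # illustrative posteriors

joint = bayes_uniform(p1, p2, p3)
left = bayes_uniform(bayes_uniform(p1, p2), p3)
right = bayes_uniform(p1, bayes_uniform(p2, p3))
# joint, left and right all equal [1/7, 6/7] up to rounding.
```

The intermediate normalizations cancel in the final ratio, which is exactly the mechanism used in the proof.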


• (P5): The levels of global agreement and global conflict between the sources do not matter in Bayes fusion rule.


Proof: This property seems surprising at first glance but, since the result of Bayes fusion is nothing but the ratio of the agreement on $x_i$ ($i = 1, 2, \ldots, N$) over the global agreement factor, many distinct sources with different global agreements (and thus with different global conflicts) can yield the same Bayes fusion result. Indeed, the ratio is kept unchanged when multiplying its numerator and denominator by the same non-null scalar value. Consequently, the absolute levels of global agreement between the sources (and therefore of global conflict also) do not matter in Bayes fusion result. What really matters is only the proportions of relative agreement factors.


Example 2: To illustrate this property, let us consider Bayes fusion rule applied to two distinct sets³ of sources represented by $Bayes(P(X|Z_1), P(X|Z_2); P(X))$ and by $Bayes(P'(X|Z_1), P'(X|Z_2); P(X))$ with the following prior and posterior pmfs:

$P(X = x_1) = 0.2$ and $P(X = x_2) = 0.8$

$P(X = x_1|Z_1) \approx 0.0607$ and $P(X = x_2|Z_1) \approx 0.9393$
$P(X = x_1|Z_2) \approx 0.6593$ and $P(X = x_2|Z_2) \approx 0.3407$

$P'(X = x_1|Z_1) \approx 0.8360$ and $P'(X = x_2|Z_1) \approx 0.1640$
$P'(X = x_1|Z_2) \approx 0.0240$ and $P'(X = x_2|Z_2) \approx 0.9760$

Applying Bayes fusion rule given by Eq. (5), one gets for $Bayes(P(X|Z_1), P(X|Z_2); P(X))$:

$$P(X = x_1|Z_1 \cap Z_2) = \frac{0.2}{0.2 + 0.4} = 1/3, \qquad P(X = x_2|Z_1 \cap Z_2) = \frac{0.4}{0.2 + 0.4} = 2/3 \qquad (28)$$

Similarly, one gets for $Bayes(P'(X|Z_1), P'(X|Z_2); P(X))$:

$$P'(X = x_1|Z_1 \cap Z_2) = \frac{0.1}{0.1 + 0.2} = 1/3, \qquad P'(X = x_2|Z_1 \cap Z_2) = \frac{0.2}{0.1 + 0.2} = 2/3 \qquad (29)$$

Therefore, one sees that $Bayes(P(X|Z_1), P(X|Z_2); P(X)) = Bayes(P'(X|Z_1), P'(X|Z_2); P(X))$ even if the levels of global agreement (and global conflict) are different. In this particular example, one has:

$$(GA_2 = 0.60) \neq (GA'_2 = 0.30), \qquad (GC_2 = 1.60) \neq (GC'_2 = 2.05) \qquad (30)$$

In summary, different sets of sources to combine (with different levels of global agreement and global conflict) can provide exactly the same result once combined with Bayes fusion rule. Hence the different levels of global agreement and global conflict do not really matter in Bayes fusion rule. What really matters in Bayes fusion rule is only the distribution of all the relative agreement factors defined as $A_s(X = x_i)/GA_s$.
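The figures of Example 2 can be checked with the sketch below, which evaluates $GA_2$ (Eq. (11)), $GC_2$ (Eq. (13)) and the fused pmf (Eq. (12)) for both sets of sources. The inputs are the four-digit approximations quoted in the example, so the outputs match only up to rounding; the function name `agreement_and_conflict` is an illustrative assumption.

```python
from math import sqrt

def agreement_and_conflict(p1, p2, prior):
    """Return (fused pmf, GA_2, GC_2) for two posterior pmfs."""
    n = len(prior)
    w1 = [p1[i] / sqrt(prior[i]) for i in range(n)]
    w2 = [p2[i] / sqrt(prior[i]) for i in range(n)]
    agree = [w1[i] * w2[i] for i in range(n)]            # A_2(X = x_i)
    ga = sum(agree)                                      # global agreement
    gc = sum(w1[i] * w2[j] for i in range(n)
             for j in range(n) if i != j)                # global conflict
    return [a / ga for a in agree], ga, gc

prior = [0.2, 0.8]
fused, ga, gc = agreement_and_conflict([0.0607, 0.9393], [0.6593, 0.3407], prior)
fused_p, ga_p, gc_p = agreement_and_conflict([0.8360, 0.1640], [0.0240, 0.9760], prior)
# fused and fused_p are both close to [1/3, 2/3], while
# (ga, gc) is close to (0.60, 1.60) and (ga_p, gc_p) to (0.30, 2.05).
```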


III. BELIEF FUNCTIONS AND DEMPSTER’S RULE 


Belief Functions (BF) were introduced in 1976 by Glenn Shafer in his mathematical theory of evidence [4], also known as Dempster-Shafer Theory (DST), in order to reason under uncertainty and to model epistemic uncertainties. We will not present in detail the foundations of DST, but only the basic mathematical definitions that are necessary for the scope of this paper. The emblematic fusion rule proposed by Shafer to combine sources of evidence characterized by their basic belief assignments (bba) is Dempster’s rule, which will be analyzed in detail in the sequel. In the literature over the years, DST has been widely defended by its proponents, who argue that: 1) probability measures are particular cases of belief functions; and 2) Dempster’s fusion rule is a generalization of Bayes fusion rule. Although statement 1) is correct, because probability measures are indeed particular (additive) belief functions (called Bayesian belief functions), we will explain why the second statement about Dempster’s rule is incorrect in general.

³The values chosen for $P(X|Z_1)$, $P(X|Z_2)$, $P'(X|Z_1)$, $P'(X|Z_2)$ here have been approximated at the fourth digit. They can be precisely determined such that the expressions for $P(X|Z_1 \cap Z_2)$ and $P'(X|Z_1 \cap Z_2)$ as given in Eqs. (28) and (29) hold. For example, the exact value of $P(x_1|Z_2)$ is obtained by solving a polynomial equation of degree 2 having as a possible solution $P(x_1|Z_2) = \frac{1}{2}(0.72 + \sqrt{0.72^2 - 4 \times 0.04}) = 0.659332590941915 \approx 0.6593$, etc.


A. Belief functions 


Let $\Theta$ be a frame of discernment of a problem under 
consideration. More precisely, the set $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$ 
consists of a list of $N$ exhaustive and exclusive elements $\theta_i$, 
$i = 1, 2, \ldots, N$. Each $\theta_i$ represents a possible state related to 
the problem we want to solve. The exhaustivity and exclusivity 
of the elements of $\Theta$ is referred to as Shafer's model of the frame 
$\Theta$. A basic belief assignment (bba), also called a belief mass 
function, $m(.) : 2^\Theta \to [0,1]$ is a mapping from the power set 
of $\Theta$ (i.e. the set of subsets of $\Theta$), denoted $2^\Theta$, to $[0,1]$, that 
verifies the following conditions [4]: 

$$m(\emptyset) = 0 \quad \text{and} \quad \sum_{X \in 2^\Theta} m(X) = 1 \qquad (31)$$


The quantity $m(X)$ represents the mass of belief exactly 
committed to $X$. An element $X \in 2^\Theta$ is called a focal element 
if and only if $m(X) > 0$. The set $\mathcal{F}(m) \triangleq \{X \in 2^\Theta \mid m(X) > 0\}$ 
of all focal elements of a bba $m(.)$ is called the core of 
the bba. A bba $m(.)$ is said to be Bayesian if its focal elements 
are singletons of $2^\Theta$. The vacuous bba characterizing the total 
ignorance, denoted$^4$ $I_t = \theta_1 \cup \theta_2 \cup \ldots \cup \theta_N$, is defined by 
$m_v(.) : 2^\Theta \to [0,1]$ such that $m_v(X) = 0$ if $X \neq \Theta$, and 
$m_v(I_t) = 1$. 


From any bba $m(.)$, the belief function $Bel(.)$ and the 
plausibility function $Pl(.)$ are defined for $\forall X \in 2^\Theta$ as: 

$$Bel(X) = \sum_{Y \in 2^\Theta \mid Y \subseteq X} m(Y), \qquad Pl(X) = \sum_{Y \in 2^\Theta \mid X \cap Y \neq \emptyset} m(Y) \qquad (32)$$


$Bel(X)$ represents the whole mass of belief that comes from 
all subsets of $\Theta$ included in $X$. It is interpreted as the 
lower bound of the probability of $X$, i.e. $P_{\min}(X)$. $Bel(.)$ 
is a subadditive measure since $\sum_{\theta_i \in \Theta} Bel(\theta_i) \leq 1$. $Pl(X)$ 
represents the whole mass of belief that comes from all 
subsets of $\Theta$ compatible with $X$ (i.e., those intersecting $X$). 
$Pl(X)$ is interpreted as the upper bound of the probability 
of $X$, i.e. $P_{\max}(X)$. $Pl(.)$ is a superadditive measure since 
$\sum_{\theta_i \in \Theta} Pl(\theta_i) \geq 1$. $Bel(X)$ and $Pl(X)$ are classically seen 
[4] as lower and upper bounds of an unknown probability 
$P(.)$, and the following inequality is satisfied $\forall X \in 2^\Theta$: 
$Bel(X) \leq P(X) \leq Pl(X)$. The belief function $Bel(.)$ (and 
the plausibility function $Pl(.)$) built from any Bayesian bba 
$m(.)$ can be interpreted as a (subjective) conditional probability 
measure provided by a given source of evidence, because if 
the bba $m(.)$ is Bayesian the following equality always holds 
[4]: $Bel(X) = Pl(X) = P(X)$. 
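
The definitions in Eq. (32) can be illustrated with a short Python sketch. It is not part of the original paper: the representation of a bba as a dictionary mapping `frozenset` focal elements to masses, the helper names `powerset`, `bel` and `pl`, and the numerical masses are all assumptions chosen for illustration.

```python
from itertools import chain, combinations

def powerset(frame):
    """All subsets of the frame as frozensets (the empty set included)."""
    items = list(frame)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def bel(m, X):
    """Bel(X): total mass of the focal elements included in X."""
    return sum(mass for Y, mass in m.items() if Y <= X)

def pl(m, X):
    """Pl(X): total mass of the focal elements intersecting X."""
    return sum(mass for Y, mass in m.items() if Y & X)

# Hypothetical bba on the frame {a, b, c}: masses sum to 1, no mass on the empty set.
frame = frozenset("abc")
m = {frozenset("a"): 0.5, frozenset("ab"): 0.3, frame: 0.2}

# Bel(X) <= Pl(X) must hold for every subset X of the frame.
for X in powerset(frame):
    assert bel(m, X) <= pl(m, X) + 1e-12

# Hypothetical Bayesian bba (singleton focal elements only): Bel and Pl coincide.
m_bayes = {frozenset("a"): 0.2, frozenset("b"): 0.3, frozenset("c"): 0.5}
```

With the non-Bayesian bba `m`, the interval [Bel(X), Pl(X)] has a non-zero width for some subsets, whereas with `m_bayes` it collapses to a single probability value, as stated above.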


4The set $\{\theta_1, \theta_2, \ldots, \theta_N\}$ and the complete ignorance $\theta_1 \cup \theta_2 \cup \ldots \cup \theta_N$ 
are both denoted $\Theta$ in DST. 
B. Dempster's rule of combination 


Dempster's rule of combination, denoted DS rule$^5$, is a 
mathematical operation, represented symbolically by $\oplus$, which 
corresponds to the normalized conjunctive fusion rule. Based 
on Shafer's model of $\Theta$, the combination of $s > 1$ independent 
and distinct sources of evidence characterized by their bba's 
$m_1(.), \ldots, m_s(.)$ related to the same frame of discernment 
$\Theta$ is denoted $m_{DS}(.) = [m_1 \oplus \ldots \oplus m_s](.)$. The quantity 
$m_{DS}(.)$ is defined mathematically as follows: $m_{DS}(\emptyset) \triangleq 0$ 
and $\forall X \neq \emptyset \in 2^\Theta$ 

$$m_{DS}(X) \triangleq \frac{m_{12\ldots s}(X)}{1 - K_{12\ldots s}} \qquad (33)$$

where the conjunctive agreement on $X$ is given by: 

$$m_{12\ldots s}(X) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = X}} m_1(X_1) m_2(X_2) \ldots m_s(X_s) \qquad (34)$$

and where the global conflict is given by: 

$$K_{12\ldots s} \triangleq \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = \emptyset}} m_1(X_1) m_2(X_2) \ldots m_s(X_s) \qquad (35)$$


When $K_{12\ldots s} = 1$, the $s$ sources are in total conflict and their 
combination cannot be computed with DS rule because Eq. 
(33) is mathematically not defined due to the 0/0 indeterminacy 
[4]. DS rule is commutative and associative, which makes it 
very attractive from an engineering implementation standpoint. 


It has been proved in [4] that the vacuous bba $m_v(.)$ 
is a neutral element for DS rule because $[m \oplus m_v](.) = 
[m_v \oplus m](.) = m(.)$ for any bba $m(.)$ defined on $2^\Theta$. This 
property looks reasonable since a totally ignorant source should 
not impact the fusion result because it brings no information 
that can be helpful for the discrimination between the elements 
of the power set $2^\Theta$. 
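
As an illustration of Eqs. (33)-(35), the following Python sketch implements the normalized conjunctive rule for bba's stored as dictionaries from `frozenset` focal elements to masses. The function name `dempster`, the data layout and the numerical bba's are assumptions made for this example, not material from the paper.

```python
from itertools import product

def dempster(*bbas):
    """Dempster's rule for s >= 1 bba's given as {frozenset: mass} dicts.

    Returns (combined bba, global conflict K); raises when K = 1.
    """
    agreement = {}   # conjunctive agreement, Eq. (34)
    conflict = 0.0   # global conflict K, Eq. (35)
    for combo in product(*(m.items() for m in bbas)):
        inter = frozenset.intersection(*(X for X, _ in combo))
        mass = 1.0
        for _, w in combo:
            mass *= w
        if inter:
            agreement[inter] = agreement.get(inter, 0.0) + mass
        else:
            conflict += mass
    if abs(1.0 - conflict) < 1e-15:
        raise ZeroDivisionError("sources are in total conflict (K = 1)")
    # Normalization by 1 - K, Eq. (33)
    return {X: v / (1.0 - conflict) for X, v in agreement.items()}, conflict

# Hypothetical bba's on the frame {a, b, c}.
frame = frozenset("abc")
m1 = {frozenset("a"): 0.6, frame: 0.4}
m2 = {frozenset("b"): 0.7, frame: 0.3}
m_v = {frame: 1.0}                      # vacuous bba

fused, K = dempster(m1, m2)             # K = 0.6 * 0.7 = 0.42
same, K0 = dempster(m1, m_v)            # the vacuous bba leaves m1 unchanged
```

The second call illustrates the neutrality of the vacuous bba: every focal element of `m1` intersected with the whole frame is itself, so no conflict arises and the masses are unchanged.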


IV. ANALYSIS OF COMPATIBILITY OF DEMPSTER’S RULE 
WITH BAYES RULE 


To analyze the compatibility of Dempster's rule with 
Bayes rule, we need to work in the probabilistic framework 
because Bayes fusion rule has been developed only in this 
theoretical framework. So in the sequel, we will manipulate 
only probability mass functions (pmfs), related with Bayesian 
bba's in the Belief Function framework. This perfectly justifies 
the restriction to a singleton (Bayesian) bba as a prior bba, since we want 
to manipulate prior probabilities to make a fair comparison 
of the results provided by both rules. If Dempster's rule is a true 
(consistent) generalization of Bayes fusion rule, it must provide 
the same results as Bayes rule when combining Bayesian bba's; 
otherwise Dempster's rule cannot be fairly claimed to be a 
generalization of Bayes fusion rule. In this section, we analyze 
the real (partial or total) compatibility of Dempster's rule with 
Bayes fusion rule. Two important cases must be analyzed 
depending on the nature of the prior information $P(X)$ one 
has at hand for performing the fusion of the sources. These 


5We denote it DS rule because it has been proposed historically by Dempster 
[2], [3], and widely promoted by Shafer in the development of DST [4]. 
sources to combine will be characterized by the following 
Bayesian bba's: 

$$m_k(.) \triangleq \{m_k(\theta_i) = P(X = x_i \mid Z_k),\ i = 1, 2, \ldots, N\}, \quad k = 1, 2, \ldots, s \qquad (36)$$

The prior information is characterized by a given bba denoted 
$m_0(.)$ that can be defined either on $2^\Theta$, or only on $\Theta$ if 
we want to deal, for the needs of our analysis, with a Bayesian 
prior. In the latter case, if $m_0(.) \triangleq \{m_0(\theta_i) = P(X = x_i),\ i = 
1, 2, \ldots, N\}$ then $m_0(.)$ plays the same role as the prior pmf 
$P(X)$ in the probabilistic framework. 


mO mt Px 








When considering a non-vacuous prior $m_0(.) \neq m_v(.)$, we 
denote Dempster's combination of $s$ sources symbolically as: 

$$m_{DS}(.) = DS(m_1(.), \ldots, m_s(.); m_0(.))$$

When the prior bba is vacuous, $m_0(.) = m_v(.)$, then $m_0(.)$ 
has no impact on Dempster's fusion result, and so we denote 
Dempster's rule symbolically as: 

$$m_{DS}(.) = DS(m_1(.), \ldots, m_s(.); m_v(.)) = DS(m_1(.), \ldots, m_s(.))$$


A. Case 1: Uniform Bayesian prior 


It is important to note that Dempster's fusion formula 
proposed by Shafer in [4] and recalled in Eq. (33) makes no 
real distinction between the nature of the sources to combine (whether 
they are posterior or prior information). In fact, formula 
(33) reduces exactly to Bayes rule given in Eq. (25) if the bba's 
to combine are Bayesian and if the prior information is either 
uniform or vacuous. Stated otherwise, the following functional 
equality holds 

$$DS(m_1(.), \ldots, m_s(.); m_0(.)) = Bayes(P(X|Z_1), \ldots, P(X|Z_s); P(X)) \qquad (37)$$

as soon as all bba's $m_i(.)$, $i = 1, 2, \ldots, s$ are Bayesian and 
coincide with $P(X|Z_i)$, $P(X)$ is uniform, and either the prior 
bba $m_0(.)$ is vacuous ($m_0(.) = m_v(.)$), or $m_0(.)$ is the uniform 
Bayesian bba. 


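Example 3: Let us consider $\Theta(X) = \{x_1, x_2, x_3\}$ with two 
distinct sources providing the following Bayesian bba's 

$$\begin{cases} m_1(x_1) = P(X = x_1|Z_1) = 0.2 \\ m_1(x_2) = P(X = x_2|Z_1) = 0.3 \\ m_1(x_3) = P(X = x_3|Z_1) = 0.5 \end{cases} \quad \text{and} \quad \begin{cases} m_2(x_1) = 0.5 \\ m_2(x_2) = 0.1 \\ m_2(x_3) = 0.4 \end{cases}$$

• If we choose as prior $m_0(.)$ the vacuous bba, that is $m_0(x_1 \cup 
x_2 \cup x_3) = 1$, then one gets 

$$m_{DS}(x_1) = \frac{1}{1 - K_{12}^{vacuous}} m_1(x_1) m_2(x_1) m_0(x_1 \cup x_2 \cup x_3) = \frac{1}{1 - 0.67} \cdot 0.2 \cdot 0.5 \cdot 1 = \frac{0.10}{0.33} \approx 0.3030$$

$$m_{DS}(x_2) = \frac{1}{1 - K_{12}^{vacuous}} m_1(x_2) m_2(x_2) m_0(x_1 \cup x_2 \cup x_3) = \frac{1}{1 - 0.67} \cdot 0.3 \cdot 0.1 \cdot 1 = \frac{0.03}{0.33} \approx 0.0909$$

$$m_{DS}(x_3) = \frac{1}{1 - K_{12}^{vacuous}} m_1(x_3) m_2(x_3) m_0(x_1 \cup x_2 \cup x_3) = \frac{1}{1 - 0.67} \cdot 0.5 \cdot 0.4 \cdot 1 = \frac{0.20}{0.33} \approx 0.6061$$

with 

$$K_{12}^{vacuous} = 1 - m_1(x_1) m_2(x_1) m_0(x_1 \cup x_2 \cup x_3) - m_1(x_2) m_2(x_2) m_0(x_1 \cup x_2 \cup x_3) - m_1(x_3) m_2(x_3) m_0(x_1 \cup x_2 \cup x_3) = 0.67$$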


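• If we choose as prior $m_0(.)$ the uniform Bayesian bba given 
by $m_0(x_1) = m_0(x_2) = m_0(x_3) = 1/3$, then we get 

$$m_{DS}(x_1) = \frac{1}{1 - K_{12}^{uniform}} m_1(x_1) m_2(x_1) m_0(x_1) = \frac{1}{1 - 0.89} \cdot 0.2 \cdot 0.5 \cdot \frac{1}{3} = \frac{0.10/3}{0.11} \approx 0.3030$$

$$m_{DS}(x_2) = \frac{1}{1 - K_{12}^{uniform}} m_1(x_2) m_2(x_2) m_0(x_2) = \frac{1}{1 - 0.89} \cdot 0.3 \cdot 0.1 \cdot \frac{1}{3} = \frac{0.03/3}{0.11} \approx 0.0909$$

$$m_{DS}(x_3) = \frac{1}{1 - K_{12}^{uniform}} m_1(x_3) m_2(x_3) m_0(x_3) = \frac{1}{1 - 0.89} \cdot 0.5 \cdot 0.4 \cdot \frac{1}{3} = \frac{0.20/3}{0.11} \approx 0.6061$$

where the degree of conflict when $m_0(.)$ is Bayesian and 
uniform is now given by $K_{12}^{uniform} = 0.89$. 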


Clearly $K_{12}^{uniform} \neq K_{12}^{vacuous}$, but the fusion results 
obtained with the two distinct priors $m_0(.)$ (vacuous or uniform) 
are the same because of the algebraic simplification by 1/3 in 
Dempster's fusion formula when using the uniform Bayesian bba. 
When combining Bayesian bba's $m_1(.)$ and $m_2(.)$, the vacuous 
prior and the uniform prior $m_0(.)$ therefore have no impact on the 
result. Indeed, they contain no information that may help to 
prefer one particular state $x_i$ with respect to the other ones, 
even if the level of conflict is different in both cases. So, the 
level of conflict doesn't matter at all in such a Bayesian case. 
As already stated, what really matters is only the distribution 
of relative agreement factors. It can be easily verified that we 
obtain the same results when applying Bayes Eq. (14), or (16). 


Only in such very particular cases (i.e. Bayesian bba's, 
and vacuous or Bayesian uniform priors) is Dempster's rule 
fully consistent with Bayes fusion rule. So the claim that 
Dempster's rule is a generalization of Bayes rule is true in this 
very particular case only, and that is why such a claim has been 
widely used to defend Dempster's rule and DST, thanks to its 
compatibility with Bayes fusion rule in that very particular 
case. Unfortunately, such compatibility is only partial and not 
general because it is no longer valid when considering the 
more general cases involving non-uniform Bayesian prior bba's, 
as shown in the next subsection. 


B. Case 2: Non uniform Bayesian prior 


Let us consider Dempster's fusion of Bayesian bba's with 
a Bayesian non-uniform prior $m_0(.)$. In such a case it is easy 
to check from the general structures of Bayes fusion rule 
(16) and Dempster's fusion rule (33) that these two rules are 
incompatible. Indeed, in Bayes rule one divides each posterior 
source $m_i(x_j)$ by $\sqrt[s]{m_0(x_j)^{s-1}}$, $i = 1, 2, \ldots, s$, whereas the prior 
source $m_0(.)$ is combined in a pure conjunctive manner by 
Dempster's rule with the bba's $m_i(.)$, $i = 1, 2, \ldots, s$, as if $m_0(.)$ 
was a simple additional source. This difference in processing 
the prior information between the two approaches clearly explains 
the incompatibility of Dempster's rule with Bayes rule when 
the Bayesian prior bba is not uniform. This incompatibility is 
illustrated in the next simple example. Mahler and Fixsen 
have already proposed in [23], [24], [25] a modification of 
Dempster's rule to force it to be compatible with Bayes 
rule when combining Bayesian bba's. The analysis of such a 
modified Dempster's rule is out of the scope of this paper. 


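Example 4: Let us consider the same frame $\Theta(X)$, and the same 
bba's $m_1(.)$ and $m_2(.)$ as in Example 3. Suppose that 
the prior information is Bayesian and non-uniform as follows: 
$m_0(x_1) = P(X = x_1) = 0.6$, $m_0(x_2) = P(X = x_2) = 0.3$ 
and $m_0(x_3) = P(X = x_3) = 0.1$. Applying Bayes rule (12) 
yields: 

$$P(x_1|Z_1 \cap Z_2) = \frac{A_2(x_1)}{GA_2} = \frac{0.2 \cdot 0.5 / 0.6}{2.2667} = \frac{0.1667}{2.2667} \approx 0.0735$$

$$P(x_2|Z_1 \cap Z_2) = \frac{A_2(x_2)}{GA_2} = \frac{0.3 \cdot 0.1 / 0.3}{2.2667} = \frac{0.1000}{2.2667} \approx 0.0441$$

$$P(x_3|Z_1 \cap Z_2) = \frac{A_2(x_3)}{GA_2} = \frac{0.5 \cdot 0.4 / 0.1}{2.2667} = \frac{2.0000}{2.2667} \approx 0.8824$$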


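Applying Dempster's rule yields $m_{DS}(x_i) \neq P(x_i|Z_1 \cap Z_2)$ 
because: 

$$m_{DS}(x_1) = \frac{1}{1 - 0.9110} \cdot 0.2 \cdot 0.5 \cdot 0.6 = \frac{0.0600}{0.0890} \approx 0.6742$$

$$m_{DS}(x_2) = \frac{1}{1 - 0.9110} \cdot 0.3 \cdot 0.1 \cdot 0.3 = \frac{0.0090}{0.0890} \approx 0.1011$$

$$m_{DS}(x_3) = \frac{1}{1 - 0.9110} \cdot 0.5 \cdot 0.4 \cdot 0.1 = \frac{0.0200}{0.0890} \approx 0.2247$$

Therefore, one has in general$^6$: 

$$DS(m_1(.), \ldots, m_s(.); m_0(.)) \neq Bayes(P(X|Z_1), \ldots, P(X|Z_s); P(X)) \qquad (38)$$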
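
The numbers of Examples 3 and 4 can be checked with the following Python sketch. It is only an illustration under stated assumptions: Bayes fusion is written in its conditional-independence form (product of the posteriors divided by the prior raised to the power s - 1, then normalized), and Dempster's rule is specialized to Bayesian bba's, where it reduces to a normalized product of all sources including the prior. The function names `bayes_fusion` and `dempster_bayesian` are hypothetical.

```python
def bayes_fusion(posteriors, prior):
    """P(x | Z1..Zs) proportional to prod_k P(x | Zk) / P(x)**(s - 1)."""
    s = len(posteriors)
    agree = {}
    for x, p0 in prior.items():
        num = 1.0
        for p in posteriors:
            num *= p[x]
        agree[x] = num / p0 ** (s - 1)      # agreement factor of state x
    total = sum(agree.values())             # global agreement
    return {x: a / total for x, a in agree.items()}

def dempster_bayesian(sources):
    """Dempster's rule for Bayesian bba's: normalized product of all sources.

    A vacuous prior is handled by simply leaving it out of `sources`.
    Returns (fused bba, conflict K).
    """
    agree = {x: 1.0 for x in sources[0]}
    for m in sources:
        for x in agree:
            agree[x] *= m[x]
    total = sum(agree.values())             # equals 1 - K
    return {x: a / total for x, a in agree.items()}, 1.0 - total

m1 = {"x1": 0.2, "x2": 0.3, "x3": 0.5}
m2 = {"x1": 0.5, "x2": 0.1, "x3": 0.4}
uniform = {"x1": 1 / 3, "x2": 1 / 3, "x3": 1 / 3}
prior = {"x1": 0.6, "x2": 0.3, "x3": 0.1}

# Example 3: a uniform (or vacuous) prior gives identical results with both rules.
ds_uniform, k_uniform = dempster_bayesian([m1, m2, uniform])
ds_vacuous, k_vacuous = dempster_bayesian([m1, m2])
bayes_uniform = bayes_fusion([m1, m2], uniform)

# Example 4: with a non-uniform prior the two rules disagree.
p_bayes = bayes_fusion([m1, m2], prior)
m_ds, k_ds = dempster_bayesian([m1, m2, prior])
```

In this sketch the only difference between the two functions is how the prior enters: as a divisor in `bayes_fusion` and as one more multiplicative source in `dempster_bayesian`, which is the structural difference discussed in Case 2.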


V. CONCLUSIONS 


In this paper, we have analyzed in detail the expression 
and the properties of Bayes rule of combination based on the 
statistical conditional independence assumption, as well as the 
emblematic Dempster's rule of combination of belief functions 
introduced by Shafer in his Mathematical Theory of Evidence. 
We have clearly explained from a theoretical standpoint, and 
also on simple examples, why Dempster's rule is not a 
generalization of Bayes rule in general. The incompatibility of 
Dempster's rule with Bayes rule is due to its inability to 
deal with non-uniform Bayesian priors in the same manner 
as Bayes rule does. Dempster's rule turns out to be compatible 
with Bayes rule only in two very particular cases: 1) if all the 
Bayesian bba's to combine (including the prior) focus on the same 
state (i.e. there is a perfect conjunctive consensus between the 
sources), or 2) if all the bba's to combine (excluding the prior) 
are Bayesian, and if the prior bba cannot help to discriminate a 
particular state of the frame of discernment (i.e. the prior bba is 
either vacuous, or Bayesian and uniform). Except in these two 
very particular cases, Dempster's rule is totally incompatible 
with Bayes rule. Therefore, Dempster's rule cannot be claimed 
to be a generalization of Bayes fusion rule, even when the bba's 
to combine are Bayesian. 
6Except in the very degenerate case of deterministic Bayesian 
bba's, which is of little practical interest from the fusion standpoint. 
Abstract: Since the development of belief function theory 
introduced by Shafer in the seventies, many combination rules have 
been proposed in the literature to combine belief functions, 
especially (but not only) in highly conflicting situations, because 
the emblematic Dempster's rule generates counter-intuitive and 
unacceptable results in practical applications. Many attempts 
have been made during the last thirty years to propose better rules 
of combination based on different frameworks and justifications. 
Recently in the DSmT (Dezert-Smarandache Theory) framework, 
two interesting and sophisticated rules (PCR5 and PCR6 
rules) have been proposed based on the Proportional Conflict 
Redistribution (PCR) principle. These two rules coincide for the 
combination of two basic belief assignments, but they differ in 
general as soon as three or more sources have to be combined 
altogether because the PCR used in PCR5 and in PCR6 are 
different. In this paper we show why PCR6 is better than PCR5 
to combine three or more sources of evidence and we prove 
the coherence of PCR6 with the simple Averaging Rule used 
classically to estimate the probability based on the frequentist 
interpretation of the probability measure. We show that such a 
probability estimate cannot be obtained using Dempster-Shafer 
(DS) rule, nor PCR5 rule. 


Keywords: Information fusion, belief functions, PCR6, 
PCR5, DSmT, frequentist probability. 


I. INTRODUCTION 


In this paper, we work with belief functions [1] defined 
from the finite and discrete frame of discernment $\Theta = 
\{\theta_1, \theta_2, \ldots, \theta_n\}$. In the Dempster-Shafer Theory (DST) 
framework, basic belief assignments (bba's) provided by the 
distinct sources of evidence are defined on the fusion space 
$2^\Theta = (\Theta, \cup)$ consisting in the power-set of $\Theta$, that is the set 
of elements of $\Theta$ and those generated from $\Theta$ with the union 
set operator. Such a fusion space assumes that the elements of 
$\Theta$ are non-empty, exhaustive and exclusive, which is called 
Shafer's model of $\Theta$. More generally, in Dezert-Smarandache 
Theory (DSmT) [2], the fusion space denoted $G^\Theta$ can also 
be either the hyper-power set $D^\Theta = (\Theta, \cup, \cap)$ (Dedekind's 
lattice), or the super-power set$^1$ $S^\Theta = (\Theta, \cup, \cap, c(.))$ depending on 
the underlying model of the frame of discernment we choose 
to fit with the nature of the problem. Details on DSm models 
are given in [2], Vol. 1. 


We assume that $s \geq 2$ basic belief assignments (bba's) 
$m_i(.)$, $i = 1, 2, \ldots, s$ provided by $s$ distinct sources of 
evidence defined on the fusion space $G^\Theta$ are available and 
we need to combine them for a final decision-making purpose. 


1$\cap$ and $c(.)$ are respectively the set intersection and complement operators. 
For doing this, many rules of combination have been proposed 
in the literature, the most emblematic ones being the simple 
Averaging Rule, Dempster-Shafer (DS) rule, and more recently 
the PCR5 and PCR6 fusion rules. 


The contribution of this paper is to analyze in depth the 
behavior of the PCR5 and PCR6 fusion rules and to explain why 
we consider it preferable to use the PCR6 rule rather than the 
PCR5 rule for combining several distinct sources of evidence 
altogether. We will show in detail the strong relationship 
between PCR6 and the averaging fusion rule which is commonly 
used to estimate the probabilities in the classical frequentist 
interpretation of probabilities. 


This paper is organized as follows. In section II, we 
briefly recall the background on belief functions and the main 
fusion rules used in this paper. Section III demonstrates the 
consistency of the PCR6 fusion rule with the Averaging Rule 
for binary masses in total conflict as well as the ability of 
PCR6 to discriminate asymmetric fusion cases for the fusion 
of Bayesian bba's. Section IV shows that PCR6 can also 
be used to estimate empirical probability in a simple (coin 
tossing) random experiment. Section V concludes and 
opens a challenging problem about the recursivity of fusion rule 
formulas that are sought for efficient implementations. 


II. BACKGROUND ON BELIEF FUNCTIONS 
A. Basic belief assignment 


Let's consider a finite discrete frame of discernment $\Theta = 
\{\theta_1, \theta_2, \ldots, \theta_n\}$, $n > 1$ of the fusion problem under 
consideration and its fusion space $G^\Theta$ which can be chosen either as $2^\Theta$, 
$D^\Theta$ or $S^\Theta$ depending on the model that fits with the problem. 
A basic belief assignment (bba) associated with a given source 
of evidence is defined as the mapping $m(.) : G^\Theta \to [0,1]$ 
satisfying $m(\emptyset) = 0$ and $\sum_{A \in G^\Theta} m(A) = 1$. The quantity 
$m(A)$ is called the mass of belief of $A$ committed by the source 
of evidence. If $m(A) > 0$ then $A$ is called a focal element 
of the bba $m(.)$. When all focal elements are singletons and 
$G^\Theta = 2^\Theta$ then $m(.)$ is called a Bayesian bba [1] and it is 
homogeneous to a (possibly subjective) probability measure. 
The vacuous bba representing a totally ignorant source is 
defined as $m_v(\Theta) = 1$. Belief and plausibility functions are 
defined by 

$$Bel(A) = \sum_{\substack{B \subseteq A \\ B \in G^\Theta}} m(B) \quad \text{and} \quad Pl(A) = \sum_{\substack{B \cap A \neq \emptyset \\ B \in G^\Theta}} m(B) \qquad (1)$$
B. Fusion rules 


The main information fusion problem in the belief function 
frameworks (DST or DSmT) is how to combine efficiently 
several distinct sources of evidence represented by $m_1(.)$, 
$m_2(.)$, ..., $m_s(.)$ ($s \geq 2$) bba's defined on $G^\Theta$. Many rules 
have been proposed for such a task (see [2], Vol. 2, for a 
detailed list of fusion rules) and we focus here on the 
following ones: 1) the Averaging Rule because it is the simplest 
one and it is used to empirically estimate probabilities in 
random experiments, 2) DS rule because it was historically 
proposed in DST, and 3) PCR5 and PCR6 rules because they 
were proposed in DSmT and have been shown to provide better 
results than the DS rule in all applications where they have 
been tested so far. So we just briefly recall how these rules are 
mathematically defined. 


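• Averaging fusion rule $m^{Average}_{1,2,\ldots,s}(.)$ 


For any $X$ in $G^\Theta$, $m^{Average}_{1,2,\ldots,s}(X)$ is defined by 

$$m^{Average}_{1,2,\ldots,s}(X) = Average(m_1, m_2, \ldots, m_s) \triangleq \frac{1}{s} \sum_{i=1}^{s} m_i(X) \qquad (2)$$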
Note that the vacuous bba $m_v(\Theta) = 1$ is not a neutral element 
for this rule. This Averaging Rule is commutative but it is not 
associative because in general 

$$m^{Average}_{1,2,3}(X) = \frac{1}{3}\left[m_1(X) + m_2(X) + m_3(X)\right]$$

is different from 

$$m^{Average}_{(1,2),3}(X) = \frac{1}{2}\left[\frac{m_1(X) + m_2(X)}{2} + m_3(X)\right]$$

which is also different from 

$$m^{Average}_{1,(2,3)}(X) = \frac{1}{2}\left[m_1(X) + \frac{m_2(X) + m_3(X)}{2}\right]$$

and also from 

$$m^{Average}_{(1,3),2}(X) = \frac{1}{2}\left[\frac{m_1(X) + m_3(X)}{2} + m_2(X)\right]$$

In fact, it is easy to prove that the following recursive formula 
holds 

$$m^{Average}_{1,2,\ldots,s}(X) = \frac{s-1}{s}\, m^{Average}_{1,2,\ldots,s-1}(X) + \frac{1}{s}\, m_s(X) \qquad (3)$$

This simple averaging fusion rule has been used for more 
than two centuries to estimate empirically the probability 
measure in random experiments [3], [4]. 
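
The averaging rule (2), its recursive form (3) and its lack of associativity can be checked with a few lines of Python. The function names `average` and `average_update`, the dictionary representation of a bba and the three sample bba's are assumptions made for this sketch.

```python
def average(bbas):
    """Averaging rule (2): arithmetic mean of the masses given to each element."""
    keys = set().union(*bbas)
    s = len(bbas)
    return {X: sum(m.get(X, 0.0) for m in bbas) / s for X in keys}

def average_update(prev_avg, s, m_new):
    """Recursive form (3): fold the s-th bba into the average of the first s - 1."""
    keys = set(prev_avg) | set(m_new)
    return {X: (s - 1) / s * prev_avg.get(X, 0.0) + m_new.get(X, 0.0) / s
            for X in keys}

# Hypothetical bba's on two elements labelled "A" and "B".
m1 = {"A": 1.0}
m2 = {"B": 1.0}
m3 = {"A": 0.4, "B": 0.6}

batch = average([m1, m2, m3])                          # all three sources at once
recursive = average_update(average([m1, m2]), 3, m3)   # formula (3)
nested = average([average([m1, m2]), m3])              # naive pairwise nesting
```

Here `batch` and `recursive` agree, while `nested` gives a different mass to "A" (0.45 instead of 1.4/3), which is the non-associativity noted above.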


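• Dempster-Shafer fusion rule $m^{DS}_{1,2,\ldots,s}(.)$ 


In the DST framework, the fusion space $G^\Theta$ equals the power-set 
$2^\Theta$ because Shafer's model of the frame $\Theta$ is assumed. 
The combination of $s \geq 2$ distinct sources of evidence 
characterized by the bba's $m_i(.)$, $i = 1, 2, \ldots, s$, is done with 
DS rule as follows [1]: $m^{DS}_{1,2,\ldots,s}(\emptyset) = 0$ and for all $X \neq \emptyset$ in 
$2^\Theta$ 

$$m^{DS}_{1,2,\ldots,s}(X) \triangleq \frac{1}{K_{1,2,\ldots,s}} \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = X}} \prod_{i=1}^{s} m_i(X_i) \qquad (4)$$

where the numerator of (4) is the mass of belief on the conjunctive 
consensus on $X$, and where $K_{1,2,\ldots,s}$ is a normalization 
constant defined by 

$$K_{1,2,\ldots,s} \triangleq \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s \neq \emptyset}} \prod_{i=1}^{s} m_i(X_i)$$

The total degree of conflict between the $s$ sources of evidence 
is defined by 

$$m_{1,2,\ldots,s}(\emptyset) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = \emptyset}} \prod_{i=1}^{s} m_i(X_i)$$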


The vacuous bba $m_v(\Theta) = 1$ is a neutral element for DS 
rule and DS rule is commutative and associative. It remains 
the milestone fusion rule of DST. Doubts on the validity 
of such a fusion rule were raised by Zadeh in 1979 
[5]-[7] based on a very simple example with two highly 
conflicting sources of evidence. Since the 1980's, many criticisms 
have been made about the behavior and justification of the 
DS rule. More recently, Dezert et al. in [8], [9] have brought 
to light other counter-intuitive behaviors of DS rule even in 
low conflicting cases and showed serious flaws in the logical 
foundations of DST. 


• PCR5 and PCR6 fusion rules 


To work in general fusion spaces $G^\Theta$ and to provide better 
fusion results in all (low or high conflicting) situations, several 
fusion rules have been developed in the DSmT framework [2]. 
Among them, two fusion rules called PCR5 and PCR6, based 
on the proportional conflict redistribution (PCR) principle, have 
proved to work efficiently in all the different applications 
where they have been used so far. The PCR principle transfers 
the conflicting mass only to the elements involved in the 
conflict and proportionally to their individual masses, so that 
the specificity of the information is entirely preserved. 


The general principle of PCR consists in: 


1) applying the conjunctive rule; 

2) calculating the total or partial conflicting masses; 

3) redistributing the (total or partial) conflicting mass 
proportionally on non-empty sets according to the 
integrity constraints one has for the frame $\Theta$. 


Because the proportional transfer can be done in two different 
ways, this has led to two different fusion rules. The PCR5 
fusion rule was proposed by Smarandache and Dezert in 
[2], Vol. 2, Chap. 1, and the PCR6 fusion rule was proposed 
by Martin and Osswald in [2], Vol. 2, Chap. 2. 


We will not present these two fusion rules in depth since 
they have already been discussed in detail with many 
examples in the aforementioned references, but we give their 
expressions here for convenience. 
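The general formula of PCR5 for the combination of $s \geq 2$ 
sources is given by $m^{PCR5}_{1,2,\ldots,s}(\emptyset) = 0$ and for $X \neq \emptyset$ in $G^\Theta$ 

$$m^{PCR5}_{1,2,\ldots,s}(X) = m_{1,2,\ldots,s}(X) + \sum_{\substack{2 \leq t \leq s \\ 1 \leq r_1, \ldots, r_t \leq s \\ 1 \leq r_1 < r_2 < \ldots < r_{t-1} < (r_t = s)}} \ \sum_{\substack{X_{j_2}, \ldots, X_{j_t} \in G^\Theta \setminus \{X\} \\ \{j_2, \ldots, j_t\} \in \mathcal{P}^{t-1}(\{1, \ldots, n\}) \\ X \cap X_{j_2} \cap \ldots \cap X_{j_t} = \emptyset \\ \{i_1, \ldots, i_s\} \in \mathcal{P}^{s}(\{1, \ldots, s\})}} \frac{\left(\prod_{k_1=1}^{r_1} m_{i_{k_1}}(X)^2\right) \cdot \left[\prod_{l=2}^{t} \left(\prod_{k_l = r_{l-1}+1}^{r_l} m_{i_{k_l}}(X_{j_l})\right)\right]}{\left(\prod_{k_1=1}^{r_1} m_{i_{k_1}}(X)\right) + \left[\sum_{l=2}^{t} \left(\prod_{k_l = r_{l-1}+1}^{r_l} m_{i_{k_l}}(X_{j_l})\right)\right]} \qquad (5)$$

where $i$, $j$, $k$, $r$, $s$ and $t$ in (5) are integers, $m_{1,2,\ldots,s}(X)$ 
corresponds to the conjunctive consensus on $X$ between the 
$s$ sources, and all denominators are different from 
zero. If a denominator is zero, that fraction is discarded; 
$\mathcal{P}^k(\{1, 2, \ldots, n\})$ is the set of all subsets of $k$ elements from 
$\{1, 2, \ldots, n\}$ (permutations of $n$ elements taken by $k$), where the 
order of elements doesn't count. 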


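The general formula of PCR6 proposed by Martin and 
Osswald for the combination of $s \geq 2$ sources is given by 
$m^{PCR6}_{1,2,\ldots,s}(\emptyset) = 0$ and for $X \neq \emptyset$ in $G^\Theta$ 

$$m^{PCR6}_{1,2,\ldots,s}(X) = m_{1,2,\ldots,s}(X) + \sum_{i=1}^{s} m_i(X)^2 \sum_{\substack{\bigcap_{k=1}^{s-1} Y_{\sigma_i(k)} \cap X = \emptyset \\ (Y_{\sigma_i(1)}, \ldots, Y_{\sigma_i(s-1)}) \in (G^\Theta)^{s-1}}} \left( \frac{\prod_{j=1}^{s-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)})}{m_i(X) + \sum_{j=1}^{s-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)})} \right) \qquad (6)$$

where $\sigma_i$ counts from 1 to $s$ avoiding $i$: 

$$\begin{cases} \sigma_i(j) = j & \text{if } j < i, \\ \sigma_i(j) = j + 1 & \text{if } j \geq i. \end{cases} \qquad (7)$$

Since $Y_i$ is a focal element of expert/source $i$, 
$m_i(X) + \sum_{j=1}^{s-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)}) \neq 0$. 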


The general PCR5 and PCR6 formulas (5)-(6) are 
rather complicated and not very easy to understand. From 
the implementation point of view, PCR6 is much simpler 
to implement than PCR5. For convenience, very basic (not 
optimized) Matlab codes of the PCR5 and PCR6 fusion rules can 
be found in [2], [10] and from the toolboxes repository on the 
web [11]. The PCR5 and PCR6 fusion rules are commutative 
but not associative, like the averaging fusion rule, but the 
vacuous belief assignment is a neutral element for these PCR 
fusion rules. 


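The PCR5 and PCR6 fusion rules simplify greatly and 
coincide for the combination of two sources ($s = 2$). In this 
simplest case, one always gets the resulting bba $m_{PCR5/6}(.) = 
m^{PCR5}_{1,2}(.) = m^{PCR6}_{1,2}(.)$ expressed as $m_{PCR5/6}(\emptyset) = 0$ and 
for all $X \neq \emptyset$ in $G^\Theta$ 

$$m_{PCR5/6}(X) = \sum_{\substack{X_1, X_2 \in G^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2) + \sum_{\substack{Y \in G^\Theta \setminus \{X\} \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2 m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2 m_1(Y)}{m_2(X) + m_1(Y)} \right] \qquad (8)$$

where all denominators in (8) are different from zero. 
If a denominator is zero, that fraction is discarded. All 
propositions/sets are in a canonical form. 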


Example 1: See [2], Vol. 2, Chap. 1 for more examples. 


Let's consider the frame of discernment $\Theta = \{A, B\}$ of 
exclusive elements. Here Shafer's model holds so that $G^\Theta = 
2^\Theta = \{\emptyset, A, B, A \cup B\}$. We consider two sources of evidence 
providing the following bba's 

$$m_1(A) = 0.6 \quad m_1(B) = 0.3 \quad m_1(A \cup B) = 0.1$$
$$m_2(A) = 0.2 \quad m_2(B) = 0.3 \quad m_2(A \cup B) = 0.5$$

Then the conjunctive consensus yields: 

$$m_{12}(A) = 0.44 \quad m_{12}(B) = 0.27 \quad m_{12}(A \cup B) = 0.05$$

with the conflicting mass 

$$m_{12}(A \cap B = \emptyset) = m_1(A) m_2(B) + m_1(B) m_2(A) = 0.18 + 0.06 = 0.24$$


One sees that only $A$ and $B$ are involved in the derivation 
of the conflicting mass, but not $A \cup B$. With PCR5/6, one 
redistributes the partial conflicting mass 0.18 to $A$ and $B$ 
proportionally with the masses $m_1(A)$ and $m_2(B)$ assigned 
to $A$ and $B$ respectively, and also the partial conflicting mass 
0.06 to $A$ and $B$ proportionally with the masses $m_2(A)$ and 
$m_1(B)$ assigned to $A$ and $B$ respectively; thus one gets two 
weighting factors of the redistribution for each corresponding 
set $A$ and $B$ respectively. Let $x_1$ be the conflicting mass to be 
redistributed to $A$, and $y_1$ the conflicting mass redistributed to 
$B$ from the first partial conflicting mass 0.18. This first partial 
proportional redistribution is then done according to 

$$\frac{x_1}{0.6} = \frac{y_1}{0.3} = \frac{x_1 + y_1}{0.6 + 0.3} = \frac{0.18}{0.9} = 0.2$$

whence $x_1 = 0.6 \cdot 0.2 = 0.12$, $y_1 = 0.3 \cdot 0.2 = 0.06$. Now 
let $x_2$ be the conflicting mass to be redistributed to $A$, and 
$y_2$ the conflicting mass redistributed to $B$ from the second 
partial conflicting mass 0.06. This second partial proportional 
redistribution is then done according to 

$$\frac{x_2}{0.2} = \frac{y_2}{0.3} = \frac{x_2 + y_2}{0.2 + 0.3} = \frac{0.06}{0.5} = 0.12$$

whence $x_2 = 0.2 \cdot 0.12 = 0.024$, $y_2 = 0.3 \cdot 0.12 = 0.036$. 
Thus one finally gets: 

$$m_{PCR5/6}(A) = 0.44 + 0.12 + 0.024 = 0.584$$
$$m_{PCR5/6}(B) = 0.27 + 0.06 + 0.036 = 0.366$$
$$m_{PCR5/6}(A \cup B) = 0.05 + 0 = 0.05$$
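
The two-source formula (8) and the numbers of Example 1 can be reproduced with the short Python sketch below. It is an illustration only, assuming Shafer's model and bba's stored as dictionaries from `frozenset` focal elements to masses; the function name `pcr56_two_sources` is hypothetical.

```python
def pcr56_two_sources(m1, m2):
    """PCR5/PCR6 for two bba's given as {frozenset: mass} dicts (Shafer's model)."""
    out = {}
    for X, a in m1.items():
        for Y, b in m2.items():
            inter = X & Y
            if inter:
                # conjunctive consensus: the product goes to the intersection
                out[inter] = out.get(inter, 0.0) + a * b
            elif a + b > 0:
                # partial conflict a*b is sent back to X and Y proportionally to a and b
                out[X] = out.get(X, 0.0) + a * a * b / (a + b)
                out[Y] = out.get(Y, 0.0) + b * b * a / (a + b)
    return out

A, B = frozenset({"A"}), frozenset({"B"})
AB = A | B
m1 = {A: 0.6, B: 0.3, AB: 0.1}
m2 = {A: 0.2, B: 0.3, AB: 0.5}

fused = pcr56_two_sources(m1, m2)
```

The two branches of the inner loop correspond to the two sums of Eq. (8): the first accumulates the conjunctive consensus, the second redistributes each partial conflict.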








• The difference between PCR5 and PCR6 fusion rules 


For the two-source case, the PCR5 and PCR6 fusion rules 
coincide. As soon as three (or more) sources are involved 
in the fusion process, PCR5 and PCR6 differ in the way the 
proportional conflict redistribution is done. For example, let's 
consider three sources with bba's $m_1(.)$, $m_2(.)$ and $m_3(.)$, 
$A \cap B = \emptyset$ for the model of the frame $\Theta$, and $m_1(A) = 0.6$, 
$m_2(B) = 0.3$, $m_3(B) = 0.1$. 


- With PCR5, the mass $m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 
0.018$ corresponding to a conflict is redistributed back to $A$ and 
$B$ only, with respect to the following proportions respectively: 
$x_A^{PCR5} = 0.01714$ and $x_B^{PCR5} = 0.00086$, because the 
proportionalization requires 

$$\frac{x_A^{PCR5}}{m_1(A)} = \frac{x_B^{PCR5}}{m_2(B) m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + m_2(B) m_3(B)}$$

that is 

$$\frac{x_A^{PCR5}}{0.6} = \frac{x_B^{PCR5}}{0.03} = \frac{0.018}{0.6 + 0.03} \approx 0.02857$$

Thus 

$$x_A^{PCR5} = 0.60 \cdot 0.02857 \approx 0.01714$$
$$x_B^{PCR5} = 0.03 \cdot 0.02857 \approx 0.00086$$


- With the PCR6 fusion rule, the partial conflicting mass 
$m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 0.018$ is redistributed 
back to $A$ and $B$ only, with respect to the following proportions 
respectively: $x_A^{PCR6} = 0.0108$ and $x_B^{PCR6} = 0.0072$, because 
the PCR6 proportionalization is done as follows: 

$$\frac{x_A^{PCR6}}{m_1(A)} = \frac{x_B^{PCR6}}{m_2(B) + m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + (m_2(B) + m_3(B))}$$

that is 

$$\frac{x_A^{PCR6}}{0.6} = \frac{x_B^{PCR6}}{0.3 + 0.1} = \frac{0.018}{0.6 + (0.3 + 0.1)} = 0.018$$

and therefore with PCR6, one finally gets the following 
redistributions to $A$ and $B$: 

$$x_A^{PCR6} = 0.6 \cdot 0.018 = 0.0108$$
$$x_B^{PCR6} = (0.3 + 0.1) \cdot 0.018 = 0.0072$$
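
The difference between the two proportionalizations can be made concrete with a small Python sketch that redistributes a single conflicting product. It only covers the situation of the example above (one product whose sets are pairwise disjoint when they differ); the function names and the `(label, mass)` input format are assumptions made for illustration, and `math.prod` requires Python 3.8 or later.

```python
from collections import defaultdict
from math import prod

def redistribute_pcr5(assignments):
    """One conflicting product, PCR5 style.

    `assignments` holds one (set label, mass) pair per source. Masses committed
    to the same set are multiplied together before the proportional split.
    """
    conflict = prod(w for _, w in assignments)
    weight = defaultdict(lambda: 1.0)
    for label, w in assignments:
        weight[label] *= w
    total = sum(weight.values())
    return {label: w * conflict / total for label, w in weight.items()}

def redistribute_pcr6(assignments):
    """Same product, PCR6 style: masses committed to the same set are added."""
    conflict = prod(w for _, w in assignments)
    weight = defaultdict(float)
    for label, w in assignments:
        weight[label] += w
    total = sum(weight.values())
    return {label: w * conflict / total for label, w in weight.items()}

# m1(A) = 0.6, m2(B) = 0.3, m3(B) = 0.1, with A and B exclusive.
product_terms = [("A", 0.6), ("B", 0.3), ("B", 0.1)]
x_pcr5 = redistribute_pcr5(product_terms)
x_pcr6 = redistribute_pcr6(product_terms)
```

Both functions give back the whole conflicting mass 0.018; they differ only in the weight given to B (0.3 x 0.1 for PCR5, 0.3 + 0.1 for PCR6).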


In [2], Vol. 2, Chap. 2, Martin and Osswald proposed 
PCR6 based on intuitive considerations, and the authors 
showed through simulations that PCR6 is more stable than 
PCR5 in terms of decision for combining $s > 2$ sources of 
evidence. Based on these results and the relative "simplicity" 
of implementation of PCR6 over PCR5, PCR6 has been 
considered more interesting/efficient than PCR5 for combining 
3 (or more) sources of evidence. 
III. CONSISTENCY OF PCR6 WITH THE AVERAGING RULE 


In this section we show why we also consider PCR6 
as better than PCR5 for combining bba's. But here, our 
argumentation is not based on particular simulation results 
and decision-making as done by Martin and Osswald, but on 
a theoretical analysis of the structure of the PCR6 fusion rule 
itself. In particular, we show the full consistency of the PCR6 rule 
with the averaging fusion rule used to empirically estimate 
probabilities in random experiments. For doing this, it is 
necessary to simplify the original PCR6 fusion formula (6). 
Such a simplification has already been proposed in [12] and the 
PCR6 fusion rule can in fact be rewritten as 

$$m^{PCR6}_{1,2,\ldots,s}(X) = m_{1,2,\ldots,s}(X) + \sum_{k=1}^{s-1} \ \sum_{\substack{X_{i_1}, X_{i_2}, \ldots, X_{i_k} \in G^\Theta \setminus X \\ \left(\bigcap_{j=1}^{k} X_{i_j}\right) \cap X = \emptyset}} \ \sum_{(i_1, i_2, \ldots, i_s) \in \mathcal{P}^s(\{1, \ldots, s\})} \left[m_{i_1}(X) + m_{i_2}(X) + \ldots + m_{i_k}(X)\right] \cdot \frac{m_{i_1}(X) \ldots m_{i_k}(X)\, m_{i_{k+1}}(X_{i_{k+1}}) \ldots m_{i_s}(X_{i_s})}{m_{i_1}(X) + \ldots + m_{i_k}(X) + m_{i_{k+1}}(X_{i_{k+1}}) + \ldots + m_{i_s}(X_{i_s})} \qquad (9)$$

where $\mathcal{P}^s(\{1, \ldots, s\})$ is the set of all permutations of 
the elements $\{1, 2, \ldots, s\}$. It should be observed that $X_{i_1}$, 
$X_{i_2}$, ..., $X_{i_k}$ may be different from each other, or some of them 
equal and others different, etc. 


We wrote this general PCR6 formula (9) in the style of 
PCR5, different from Arnaud Martin and Christophe Osswald's 
notations, but actually doing the same thing. In order not 
to complicate the formula of PCR6, we did not use more 
summations or products after the third Sigma. 


We are now able to establish the consistency of the general 
PCR6 formula with the Averaging fusion rule for the case of 
binary bba's through the following Theorem 1. 


Theorem 1: When $s \geq 2$ sources of evidence provide binary 
bba's on $G^\Theta$ whose total conflicting mass is 1, then the PCR6 
fusion rule coincides with the averaging fusion rule. Otherwise, 
PCR6 and the averaging fusion rule provide in general different 
results. 


Proof 1: All $s \geq 2$ bba's are assumed binary, i.e. $m(X) = 0$ 
or 1 (only the two numerical values 0 and 1 are allowed) for any 
bba $m(.)$ and for any set $X$ in the focal elements. A focal 
element in this case is an element $X$ such that at least one of 
the $s$ binary sources assigns a mass equal to 1 to $X$. Let's 
suppose the focal elements are $F_1, F_2, \ldots, F_n$. Then the set 
of bba's to combine can be expressed as in Table I, 


Table I. 
LIST OF BBA'S TO COMBINE. 

bba's \ Focal elem. | F_1 | F_2 | ... | F_n 
m_1(.)              |  x  |  x  | ... |  x 
m_2(.)              |  x  |  x  | ... |  x 
...                 | ... | ... | ... | ... 
m_s(.)              |  x  |  x  | ... |  x 


where 

• all x are 0's or 1's; 


• on each row there is only one 1 (since the sum of 
all masses of a bba is equal to 1) and all the other 
elements are 0's; 

• also each column has at least one 1 (since all elements 
are focal; if there was a column corresponding, 
for example, to the set $F_p$ having only 0's, then it 
would result that the set $F_p$ is not focal, i.e. that all 
$m(F_p) = 0$). 


Using PCR6, we first need to apply the conjunctive rule to all s sources, and the result is a sum of products of the form $m_1(X_1) m_2(X_2) \cdots m_s(X_s)$, where $X_1, X_2, \ldots, X_s$ are the focal elements $F_1, F_2, \ldots, F_n$ in various permutations, with $s \ge n$. If $s > n$, some focal elements $X_i$ are repeated in the product $m_1(X_1) m_2(X_2) \cdots m_s(X_s)$. But there is only one product of this form which is not equal to zero, namely the product in which each factor is equal to 1 (i.e. the product that collects from each row the single existing 1), so that $m_1(X_1) m_2(X_2) \cdots m_s(X_s) = 1$. Since the total conflicting mass is equal to 1, this product represents the total conflict. In this case the PCR6 formula (9) becomes


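$$
m_{1,2,\ldots,s}^{PCR6}(X) = 0 + \sum_{k=1}^{s-1} \;
\sum_{\substack{X_{i_1}, X_{i_2}, \ldots, X_{i_k} \in G^\Theta \setminus X \\ \left(\bigcap_{j=1}^{k} X_{i_j}\right) \cap X = \emptyset}} \;
\sum_{(i_1, i_2, \ldots, i_k) \in \mathcal{P}^{s}(\{1, \ldots, s\})}
\left[ 1 + 1 + \ldots + 1 \right] \cdot
\frac{1 \cdot 1 \cdot \ldots \cdot 1 \cdot 1 \cdot \ldots \cdot 1}{1 + 1 + \ldots + 1 + 1 + \ldots + 1}
\tag{10}
$$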


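The previous expression can be rewritten as

$$
m_{1,2,\ldots,s}^{PCR6}(X) = \sum_{k=1}^{s-1} \;
\sum_{\substack{X_{i_1}, X_{i_2}, \ldots, X_{i_k} \in G^\Theta \setminus X \\ \left(\bigcap_{j=1}^{k} X_{i_j}\right) \cap X = \emptyset}} \;
\sum_{(i_1, i_2, \ldots, i_k) \in \mathcal{P}^{s}(\{1, \ldots, s\})}
k \cdot \frac{1}{s}
$$

which is equal to k/s since there is only one possible non-null product of the form $m_1(X_1) m_2(X_2) \cdots m_s(X_s)$, and all other products are equal to zero. Therefore, we finally get:

$$
m_{1,2,\ldots,s}^{PCR6}(X) = \frac{k}{s} \tag{11}
$$

where k is the number of bba's m(.) which give m(X) = 1. Therefore PCR6 in this case reduces to the average of masses, which completes Proof 1 of the theorem.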


Proof 2: A second method of proving this theorem is as follows. Let $m_1(.), m_2(.), \ldots, m_s(.)$, for $s \ge 2$, be the bba's of the sources of information to combine and denote by $F = \{F_1, F_2, \ldots, F_n\}$, for $n \ge 2$, the set of all focal elements. All sources give only binary masses, i.e. $m_k(F_l) = 0$ or $m_k(F_l) = 1$ for any $k \in \{1, 2, \ldots, s\}$ and any $l \in \{1, 2, \ldots, n\}$. Since each $F_i$, $1 \le i \le n$, is a focal element, there exists at least one bba $m_{j_i}(.)$ such that $m_{j_i}(F_i) = 1$; otherwise (i.e. if all sources gave $F_i$ a mass equal to zero) $F_i$ would not be focal. Without loss of generality, we can regroup the masses (since we combine all of them at once, their order doesn't matter) as in Table II.

Table II. LIST OF REORDERED BINARY BBA'S.

    bba's \ Focal elem.    F_1    F_2    ...    F_n
    first i_1 bba's         1      0     ...     0
    next i_2 bba's          0      1     ...     0
    ...                    ...    ...    ...    ...
    last i_n bba's          0      0     ...     1

Of course $i_1 + i_2 + \ldots + i_n = s$, since the s bba's are the same but reordered, and $i_1 \ge 1$, $i_2 \ge 1$, ..., $i_n \ge 1$. According to the theorem hypothesis, the total conflicting mass $m_{1,2,\ldots,s}(\emptyset)$ is 1. With the PCR6 fusion rule we transfer the conflicting mass back to the focal elements $F_1, F_2, \ldots, F_n$ respectively, according to the PCR principle, such that:

$$
\frac{x_{F_1}}{\underbrace{1 + 1 + \ldots + 1}_{i_1 \text{ times}}}
= \frac{x_{F_2}}{\underbrace{1 + 1 + \ldots + 1}_{i_2 \text{ times}}}
= \ldots
= \frac{x_{F_n}}{\underbrace{1 + 1 + \ldots + 1}_{i_n \text{ times}}}
= \frac{m_{1,2,\ldots,s}(\emptyset)}{i_1 + i_2 + \ldots + i_n}
= \frac{1}{s}
$$

whence $x_{F_1} = i_1/s$, $x_{F_2} = i_2/s$, ..., $x_{F_n} = i_n/s$. Therefore $m_{1,2,\ldots,s}^{PCR6}(F_1) = i_1/s$, $m_{1,2,\ldots,s}^{PCR6}(F_2) = i_2/s$, ..., $m_{1,2,\ldots,s}^{PCR6}(F_n) = i_n/s$. But averaging the masses $m_1(.), m_2(.), \ldots, m_s(.)$ is equivalent to averaging each column $F_1, F_2, \ldots, F_n$. Hence the average of column $F_1$ is $i_1/s$, the average of column $F_2$ is $i_2/s$, ..., and the average of column $F_n$ is $i_n/s$. Therefore, in the case of binary bba's which are globally totally conflicting, the PCR6 rule is equal to the Averaging Rule. This completes Proof 2 of the theorem.


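Note that using the PCR5 fusion rule, we also transfer the total conflicting mass, which is equal to 1, to $F_1, F_2, \ldots, F_n$ respectively, but we replace the addition "+" with the multiplication "·" in the above proportionalizations:

$$
\frac{x_{F_1}}{\underbrace{1 \cdot 1 \cdot \ldots \cdot 1}_{i_1 \text{ times}}}
= \frac{x_{F_2}}{\underbrace{1 \cdot 1 \cdot \ldots \cdot 1}_{i_2 \text{ times}}}
= \ldots
= \frac{x_{F_n}}{\underbrace{1 \cdot 1 \cdot \ldots \cdot 1}_{i_n \text{ times}}}
= \frac{m_{1,2,\ldots,s}(\emptyset)}{\underbrace{1 + 1 + \ldots + 1}_{n \text{ times}}}
= \frac{1}{n}
$$

so that $x_{F_1} = 1/n$, $x_{F_2} = 1/n$, ..., $x_{F_n} = 1/n$, and therefore

$$
m_{1,2,\ldots,s}^{PCR5}(F_1) = m_{1,2,\ldots,s}^{PCR5}(F_2) = \ldots = m_{1,2,\ldots,s}^{PCR5}(F_n) = 1/n
$$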


Corollary 1: When $s \ge 2$ sources of evidence provide binary bba's on $G^\Theta$ with at least two focal elements, and all focal elements are pairwise disjoint, then the PCR6 fusion rule coincides with the Averaging Rule.


This Corollary is true because if all focal elements are pairwise disjoint then the total conflict is equal to 1.


Examples 2: where PCR6 rule equals the Averaging Rule. 


Let's consider the frame $\Theta = \{A, B\}$ with Shafer's model and the bba's to combine as given in Table III.

Table III. LIST OF BBA'S TO COMBINE FOR EXAMPLE 2.

    bba's \ Focal elem.    A    B    A ∪ B
    m_1(.)                 1    0      0
    m_2(.)                 0    1      0
    m_3(.)                 0    0      1





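Since we have binary masses, and their total conflict is 1, we expect to get the same result for PCR6 and the Averaging Rule according to our Theorem 1. The PCR principle gives us

$$
\frac{x_A}{1} = \frac{y_B}{1} = \frac{z_{A \cup B}}{1} = \frac{m_{1,2,3}(\emptyset)}{1 + 1 + 1} = \frac{1}{3}
$$

Hence $x_A = y_B = z_{A \cup B} = \frac{1}{3}$, so that

$$
\begin{aligned}
m_{1,2,3}^{PCR6}(A) &= m_{1,2,3}(A) + x_A = 0 + \tfrac{1}{3} = \tfrac{1}{3} \\
m_{1,2,3}^{PCR6}(B) &= m_{1,2,3}(B) + y_B = 0 + \tfrac{1}{3} = \tfrac{1}{3} \\
m_{1,2,3}^{PCR6}(A \cup B) &= m_{1,2,3}(A \cup B) + z_{A \cup B} = 0 + \tfrac{1}{3} = \tfrac{1}{3}
\end{aligned}
$$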





Interestingly, PCR5 gives the same result as PCR6 in this case since one makes the same proportionalizations as for PCR6. Using the Averaging Rule (2), we get

$$
\begin{aligned}
m_{1,2,3}^{Average}(A) &= \tfrac{1}{3} \cdot (1 + 0 + 0) = \tfrac{1}{3} \\
m_{1,2,3}^{Average}(B) &= \tfrac{1}{3} \cdot (0 + 1 + 0) = \tfrac{1}{3} \\
m_{1,2,3}^{Average}(A \cup B) &= \tfrac{1}{3} \cdot (0 + 0 + 1) = \tfrac{1}{3}
\end{aligned}
$$

So we see that the PCR6 rule equals the Averaging Rule, as proved in the theorem, because the bba's are binary and the intersection of all focal elements is empty: $A \cap B \cap (A \cup B) = \emptyset \cap (A \cup B) = \emptyset$, because $A \cap B = \emptyset$ since Shafer's model has been assumed for the frame $\Theta$.
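
The comparison made in Example 2 can be checked numerically. The following is a minimal sketch, assuming Shafer's model with sets encoded as Python `frozenset` objects and exact `Fraction` masses; the helper names `pcr6` and `average` are illustrative and do not come from the paper. The `pcr6` function applies the proportional redistribution described above: each partial conflict is given back to the elements involved, in proportion to the individual masses $m_i(X_i)$.

```python
from fractions import Fraction
from itertools import product


def pcr6(bbas):
    """Fuse all bba's at once with PCR6; each bba maps frozenset -> mass."""
    fused = {}
    # Pick one focal element per source and visit every such combination.
    for combo in product(*(m.items() for m in bbas)):
        elems = [x for x, _ in combo]
        masses = [v for _, v in combo]
        p = Fraction(1)
        for v in masses:
            p *= v
        if p == 0:
            continue
        common = frozenset.intersection(*elems)
        if common:
            # Non-empty intersection: ordinary conjunctive consensus.
            fused[common] = fused.get(common, 0) + p
        else:
            # Partial conflict: return it to each X_i in proportion to m_i(X_i).
            total = sum(masses)
            for x, v in zip(elems, masses):
                fused[x] = fused.get(x, 0) + p * v / total
    return fused


def average(bbas):
    """Averaging rule: arithmetic mean of the masses, element by element."""
    out = {}
    for m in bbas:
        for x, v in m.items():
            out[x] = out.get(x, 0) + Fraction(v) / len(bbas)
    return out


A, B = frozenset("A"), frozenset("B")
# Example 2: three binary bba's in total conflict.
example2 = [{A: Fraction(1)}, {B: Fraction(1)}, {A | B: Fraction(1)}]
print(pcr6(example2))
print(average(example2))
```

Both calls return a mass of 1/3 on each of A, B and A ∪ B, in agreement with the hand computation above. The enumeration grows exponentially with the number of sources, so this sketch is only meant for small illustrative cases.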


Examples 3: where PCR6 differs from the Averaging Rule. 


Let's consider the frame $\Theta = \{A, B, C\}$ with Shafer's model and the bba's to combine as given in Table IV.

Table IV. LIST OF BBA'S TO COMBINE FOR EXAMPLE 3.

    bba's \ Focal elem.    A    A ∪ B    A ∪ B ∪ C
    m_1(.)                 1      0          0
    m_2(.)                 0      1          0
    m_3(.)                 0      0          1



Clearly, in this case the focal elements are nested and the condition on emptiness of the intersection of all focal elements is not satisfied because one has $A \cap (A \cup B) \cap (A \cup B \cup C) = A \ne \emptyset$, so that the theorem cannot be applied in this case. The total conflicting mass is not 1. One can verify in this example that the PCR6 rule differs from the Averaging Rule because one gets

$$
\begin{aligned}
m_{1,2,3}^{PCR6}(A) &= m_{1,2,3}(A) = 1 \\
m_{1,2,3}^{PCR6}(A \cup B) &= m_{1,2,3}(A \cup B) = 0 \\
m_{1,2,3}^{PCR6}(A \cup B \cup C) &= m_{1,2,3}(A \cup B \cup C) = 0
\end{aligned}
$$

since there is no conflicting mass to redistribute with the PCR principle, whereas the averaging fusion rule gives

$$
\begin{aligned}
m_{1,2,3}^{Average}(A) &= \tfrac{1}{3} \cdot (1 + 0 + 0) = \tfrac{1}{3} \\
m_{1,2,3}^{Average}(A \cup B) &= \tfrac{1}{3} \cdot (0 + 1 + 0) = \tfrac{1}{3} \\
m_{1,2,3}^{Average}(A \cup B \cup C) &= \tfrac{1}{3} \cdot (0 + 0 + 1) = \tfrac{1}{3}
\end{aligned}
$$


Examples 4 (Bayesian non-binary bba’s): where PCR6 
differs from the Averaging Rule. 


Let's consider the frame $\Theta = \{A, B\}$ with Shafer's model and the Bayesian bba's to combine as given in Table V.

Table V. LIST OF BBA'S TO COMBINE FOR EXAMPLE 4.

    bba's \ Focal elem.     A        B       A ∩ B = ∅
    m_1(.)                 0.2      0.8         0
    m_2(.)                 0.6      0.4         0
    m_3(.)                 0.7      0.3         0
    m_{1,2,3}(.)           0.084    0.096       0.82

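The total conflicting mass $m_{1,2,3}(A \cap B = \emptyset) = 0.82 = 1 - m_1(A) m_2(A) m_3(A) - m_1(B) m_2(B) m_3(B)$ equals the sum of the partial conflicting masses that will be redistributed through the PCR principle in PCR6:

$$
\begin{aligned}
m_{1,2,3}(A \cap B = \emptyset) ={}& \underbrace{m_1(A) m_2(B) m_3(B)}_{0.024} + \underbrace{m_2(A) m_1(B) m_3(B)}_{0.144} + \underbrace{m_3(A) m_1(B) m_2(B)}_{0.224} \\
&+ \underbrace{m_1(B) m_2(A) m_3(A)}_{0.336} + \underbrace{m_2(B) m_1(A) m_3(A)}_{0.056} + \underbrace{m_3(B) m_1(A) m_2(A)}_{0.036} = 0.82
\end{aligned}
$$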








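Applying the PCR principle to each of these six partial conflicts, one gets:

• for $m_1(A) m_2(B) m_3(B) = 0.2 \cdot 0.4 \cdot 0.3 = 0.024$:
$$\frac{x_1(A)}{0.2} = \frac{y_1(B)}{0.4 + 0.3} = \frac{0.024}{0.2 + 0.4 + 0.3}$$
whence $x_1(A) \approx 0.005333$ and $y_1(B) \approx 0.018667$.

• for $m_2(A) m_1(B) m_3(B) = 0.6 \cdot 0.8 \cdot 0.3 = 0.144$:
$$\frac{x_2(A)}{0.6} = \frac{y_2(B)}{0.8 + 0.3} = \frac{0.144}{0.6 + 0.8 + 0.3}$$
whence $x_2(A) \approx 0.050824$ and $y_2(B) \approx 0.093176$.

• for $m_3(A) m_1(B) m_2(B) = 0.7 \cdot 0.8 \cdot 0.4 = 0.224$:
$$\frac{x_3(A)}{0.7} = \frac{y_3(B)}{0.8 + 0.4} = \frac{0.224}{0.7 + 0.8 + 0.4}$$
whence $x_3(A) \approx 0.082526$ and $y_3(B) \approx 0.141474$.

• for $m_1(B) m_2(A) m_3(A) = 0.8 \cdot 0.6 \cdot 0.7 = 0.336$:
$$\frac{x_4(A)}{0.6 + 0.7} = \frac{y_4(B)}{0.8} = \frac{0.336}{0.8 + 0.6 + 0.7}$$
whence $x_4(A) \approx 0.208000$ and $y_4(B) \approx 0.128000$.

• for $m_2(B) m_1(A) m_3(A) = 0.4 \cdot 0.2 \cdot 0.7 = 0.056$:
$$\frac{x_5(A)}{0.2 + 0.7} = \frac{y_5(B)}{0.4} = \frac{0.056}{0.4 + 0.2 + 0.7}$$
whence $x_5(A) \approx 0.038769$ and $y_5(B) \approx 0.017231$.

• for $m_3(B) m_1(A) m_2(A) = 0.3 \cdot 0.2 \cdot 0.6 = 0.036$:
$$\frac{x_6(A)}{0.2 + 0.6} = \frac{y_6(B)}{0.3} = \frac{0.036}{0.3 + 0.2 + 0.6}$$
whence $x_6(A) \approx 0.026182$ and $y_6(B) \approx 0.009818$.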








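Therefore, with PCR6 one finally gets

$$
\begin{aligned}
m_{1,2,3}^{PCR6}(A) &= 0.084 + \sum_{i=1}^{6} x_i(A) = 0.495634 \\
m_{1,2,3}^{PCR6}(B) &= 0.096 + \sum_{i=1}^{6} y_i(B) = 0.504366
\end{aligned}
$$

whereas the Averaging Rule (2) will give us

$$
\begin{aligned}
m_{1,2,3}^{Average}(A) &= \tfrac{1}{3} (0.2 + 0.6 + 0.7) = \tfrac{1.5}{3} = 0.5 \\
m_{1,2,3}^{Average}(B) &= \tfrac{1}{3} (0.8 + 0.4 + 0.3) = \tfrac{1.5}{3} = 0.5
\end{aligned}
$$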


In this example, the intersection of the focal elements is empty but the bba's to combine are not binary. Therefore the conflict between the sources is not total, the theorem doesn't apply, and so the PCR6 results differ from those of the Averaging Rule.
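
The six proportional redistributions of Example 4 can be reproduced mechanically. The sketch below assumes the masses $m_1 = (0.2, 0.8)$, $m_2 = (0.6, 0.4)$, $m_3 = (0.7, 0.3)$ on (A, B) used in the computations above; the function name `pcr6` and the set encoding are illustrative choices, not part of the paper.

```python
from itertools import product
from math import prod

A, B = frozenset("A"), frozenset("B")


def pcr6(bbas):
    """PCR6 applied globally to all sources (Shafer's model, frozenset keys)."""
    fused = {}
    for combo in product(*(m.items() for m in bbas)):
        elems, masses = zip(*combo)
        p = prod(masses)
        if p == 0:
            continue
        common = frozenset.intersection(*elems)
        if common:
            # Agreement between all sources: conjunctive consensus.
            fused[common] = fused.get(common, 0.0) + p
        else:
            # Partial conflict p is split proportionally to the masses involved.
            total = sum(masses)
            for x, v in zip(elems, masses):
                fused[x] = fused.get(x, 0.0) + p * v / total
    return fused


example4 = [{A: 0.2, B: 0.8}, {A: 0.6, B: 0.4}, {A: 0.7, B: 0.3}]
fused = pcr6(example4)
mean = {x: sum(m[x] for m in example4) / len(example4) for x in (A, B)}
print(round(fused[A], 6), round(fused[B], 6))
print(round(mean[A], 6), round(mean[B], 6))
```

Under these assumptions the PCR6 masses agree with the values 0.495634 and 0.504366 obtained above to six decimals, while the plain average stays at 0.5 for both A and B.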


It can however happen that in some very particular symmetric cases PCR6 coincides with the Averaging Rule. For example, consider the bba's given in Table VI. In this case the opinion of source #1 totally balances the opinion of source #3, and the opinion of source #2 cannot support A more than B (and reciprocally), so that the fusion problem is totally symmetrical. In this example, it is expected that the final fusion result should commit an equal mass of belief to A and to B. And indeed, it can be easily verified that one gets in this case

$$
\begin{aligned}
m_{1,2,3}^{PCR6}(A) &= m_{1,2,3}^{Average}(A) = 0.5 \\
m_{1,2,3}^{PCR6}(B) &= m_{1,2,3}^{Average}(B) = 0.5
\end{aligned}
$$

which makes perfect sense. Note that the Averaging Rule provides the same result on Example 4, which is somewhat questionable because Example 4 doesn't present an inherent symmetrical structure. In our opinion PCR6 has the advantage of responding more adequately to a change in the inherent internal structure (asymmetry) of the bba's to combine, which is not well captured by the simple averaging fusion rule.


Table VI. A BAYESIAN NON-BINARY SYMMETRIC EXAMPLE.

    bba's \ Focal elem.    A    B    A ∩ B = ∅
    [numerical entries for m_1(.), m_2(.), m_3(.) are illegible in the source]


IV. APPLICATION TO PROBABILITY ESTIMATION 


Let's review a simple coin tossing random experiment. When we flip a coin [13], there are two possible outcomes. The coin could land showing a head (H) or a tail (T). The list of all possible outcomes is called the sample space and corresponds to the frame $\Theta = \{H, T\}$. There exist many interpretations of probability [14] that are out of the scope of this paper. We focus here on the estimation of the probability measure P(H) of a given coin (biased or not) based on n outcomes of a coin tossing experiment. The long-run frequentist interpretation of probability [15] considers that the probability of an event A is its relative frequency of occurrence over time after repeating the experiment a large number of times under similar circumstances, that is

$$
P(A) = \lim_{n \to \infty} \frac{n(A)}{n} \tag{12}
$$


where n(A) denotes the number of occurrences of an event A in n > 0 trials. In practice however, we usually estimate the probability of an event A based only on the limited number of data (observations) that are available, and so we estimate the idealistic P(A) defined in (12) by the classical Laplace probability definition

$$
\hat{P}(A \mid n(A), n) = \frac{n(A)}{n} \tag{13}
$$

Naturally, $\hat{P}(A) \ge 0$ because $n(A) \ge 0$ and $n > 0$, and $\hat{P}(A) \le 1$ because we cannot get $n(A) > n$ in a series of n trials. $\hat{P}(A) + \hat{P}(\bar{A}) = 1$ because $\frac{n(A)}{n} + \frac{n(\bar{A})}{n} = \frac{n(A) + n(\bar{A})}{n} = 1$, where $\bar{A}$ is the complement of A in the sample space.


It is interesting to note that the classical estimation of the probability measure given by (13) corresponds in fact to the simple averaging fusion rule of distinct pieces of evidence represented by binary masses. For example, let's take a coin and flip it n = 8 times and assume for instance that we observe the following series of outcomes $\{o_1 = H, o_2 = H, o_3 = T, o_4 = H, o_5 = T, o_6 = H, o_7 = H, o_8 = T\}$, so that n(H) = 5 and n(T) = 3. Then these observations can be associated with distinct sources of evidence providing the following basic (binary) belief assignments:

Table VII. OUTCOMES OF A COIN TOSSING EXPERIMENT.

    bba's \ Focal elem.    H    T
    m_1(.)                 1    0
    m_2(.)                 1    0
    m_3(.)                 0    1
    m_4(.)                 1    0
    m_5(.)                 0    1
    m_6(.)                 1    0
    m_7(.)                 1    0
    m_8(.)                 0    1





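It is clear that the probability estimate in (13) equals the result of the averaging fusion rule (2) in this example because

$$
\begin{aligned}
\hat{P}(H \mid \{o_1, o_2, \ldots, o_8\}) &= \frac{n(H)}{n} = \frac{5}{8} && \text{by eq. (13)} \\
&= \tfrac{1}{8} (1 + 1 + 0 + 1 + 0 + 1 + 1 + 0) \\
&= m_{1,2,\ldots,8}^{Average}(H) && \text{by eq. (2)}
\end{aligned}
$$

$$
\begin{aligned}
\hat{P}(T \mid \{o_1, o_2, \ldots, o_8\}) &= \frac{n(T)}{n} = \frac{3}{8} && \text{by eq. (13)} \\
&= \tfrac{1}{8} (0 + 0 + 1 + 0 + 1 + 0 + 0 + 1) \\
&= m_{1,2,\ldots,8}^{Average}(T) && \text{by eq. (2)}
\end{aligned}
$$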


Because all the bba's to combine here are binary and are in total conflict, our theorem 1 of Section II applies, and the PCR6 fusion rule in this case coincides with the averaging fusion rule. Therefore we can use PCR6 to estimate the probabilities that the coin will land on H or T at the next toss given the series of observations. More precisely,

$$
\begin{aligned}
m_{1,2,\ldots,8}^{PCR6}(H) &= m_{1,2,\ldots,8}^{Average}(H) = \hat{P}(H \mid \{o_1, o_2, \ldots, o_8\}) \\
m_{1,2,\ldots,8}^{PCR6}(T) &= m_{1,2,\ldots,8}^{Average}(T) = \hat{P}(T \mid \{o_1, o_2, \ldots, o_8\})
\end{aligned}
$$


We must insist on the fact that the Dempster-Shafer (DS) rule of combination (4) cannot be used at all in this very simple case to estimate the probability measure correctly, because the DS rule doesn't work (because of division by zero) in totally conflicting situations. The PCR5 rule can be applied to combine these 8 bba's but is unable to provide a result consistent with the classical probability estimates because one will get

$$
\frac{x_H}{1 \cdot 1 \cdot 1 \cdot 1 \cdot 1} = \frac{y_T}{1 \cdot 1 \cdot 1} = \frac{m_{1,2,\ldots,8}(\emptyset)}{1 + 1} = \frac{1}{2} = 0.5
$$

and therefore the PCR5 fusion result is

$$
\begin{aligned}
m_{1,2,\ldots,8}^{PCR5}(H) &= x_H = 0.5 \ne \left( m_{1,2,\ldots,8}^{Average}(H) = 5/8 \right) \\
m_{1,2,\ldots,8}^{PCR5}(T) &= y_T = 0.5 \ne \left( m_{1,2,\ldots,8}^{Average}(T) = 3/8 \right)
\end{aligned}
$$


Remark: The PCR6 fusion result is valid if and only if 
PCR6 rule is applied globally, and not sequentially. If PCR6 
is sequentially applied, it becomes equivalent with PCR5 
sequentially applied and it will generate incorrect results for 
combining s > 2 sources of evidence. Because of the ability 
of PCR6 to estimate frequentist probabilities in a random 
experiment, we strongly recommend PCR6 rather than PCR5 
as soon as s > 2 bba’s have to be combined altogether. 
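
The coin tossing comparison can be sketched in a few lines. The code below is an illustration under stated assumptions: the function name `fuse` and its `rule` switch are hypothetical, and the PCR5 branch implements the proportionalization described in the text (weights are products of the masses committed to the same element), while the PCR6 branch uses sums of those masses. Both rules are applied globally to the 8 binary bba's.

```python
from fractions import Fraction
from itertools import product
from math import prod


def fuse(bbas, rule="PCR6"):
    """Global PCR5 or PCR6 fusion of bba's given as dicts frozenset -> mass."""
    fused = {}
    for combo in product(*(m.items() for m in bbas)):
        elems, masses = zip(*combo)
        p = prod(masses)
        if p == 0:
            continue
        common = frozenset.intersection(*elems)
        if common:
            fused[common] = fused.get(common, 0) + p
            continue
        # Weight of each element involved in the partial conflict:
        # PCR6 adds the masses committed to it, PCR5 multiplies them.
        weights = {}
        for x, v in zip(elems, masses):
            if rule == "PCR6":
                weights[x] = weights.get(x, 0) + v
            else:
                weights[x] = weights.get(x, 1) * v
        total = sum(weights.values())
        for x, w in weights.items():
            fused[x] = fused.get(x, 0) + p * w / total
    return fused


H, T = frozenset("H"), frozenset("T")
outcomes = "HHTHTHHT"  # the series o_1, ..., o_8 of the example
bbas = [{frozenset(o): Fraction(1)} for o in outcomes]

print(fuse(bbas, "PCR6"))  # relative frequencies of H and T
print(fuse(bbas, "PCR5"))  # equal split between H and T
```

With these inputs the PCR6 branch returns 5/8 for H and 3/8 for T, matching the frequentist estimate, while the PCR5 branch returns 1/2 for each outcome.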


V. CONCLUSIONS AND CHALLENGE 


In this paper, we have proved that the PCR6 fusion rule coincides with the Averaging Rule when the bba's to combine are binary and in total conflict. Because of this property, PCR6 is able to provide a frequentist probability measure of any event occurring in a random experiment, contrary to other fusion rules like the DS rule, the PCR5 rule, etc. (except the Averaging Rule of course, since it is the basis of the frequentist probability interpretation). In a more general context with non-binary bba's, PCR6 is quite complicated to apply to combine globally s > 2 sources of evidence, and a general recursive formula of PCR6 would be very convenient. The problem can be mathematically formulated as follows: Let R be a fusion rule and assume we have s sources that provide $m_1, m_2, \ldots, m_{s-1}, m_s$ respectively on a fusion space $G^\Theta$. Find a function (or an operator) T such that $T(R(m_1, m_2, \ldots, m_{s-1}), m_s) = R(m_1, m_2, \ldots, m_{s-1}, m_s)$, or, by simplifying the notations, $T(R_{s-1}, m_s) = R_s$, where $R_i$ means the fusion rule R applied to i masses all together. For example, if R equals the Averaging Rule, the function T is defined according to the relation (3) by $T(R_{s-1}, m_s) = \frac{s-1}{s} R_{s-1} + \frac{1}{s} m_s = R_s$, and if R equals the DS rule one has $T(R_{s-1}, m_s) = DS(R_{s-1}, m_s)$ because of the associativity of the DS rule. What is the T operator associated with PCR6? This very important and challenging open question is left for future research.
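
For the Averaging Rule the operator T is easy to illustrate. The sketch below assumes the usual running-mean recursion $T(R_{s-1}, m_s) = \frac{s-1}{s} R_{s-1} + \frac{1}{s} m_s$; the names `average` and `t_average` are hypothetical helpers, and the data are the eight coin outcomes of the previous section.

```python
from fractions import Fraction


def average(bbas):
    """Batch averaging rule R_s applied to all s bba's at once."""
    s = len(bbas)
    out = {}
    for m in bbas:
        for x, v in m.items():
            out[x] = out.get(x, Fraction(0)) + Fraction(v, s)
    return out


def t_average(r_prev, m_new, s):
    """Operator T for averaging: update R_{s-1} with the s-th bba."""
    keys = set(r_prev) | set(m_new)
    return {
        x: Fraction(s - 1, s) * r_prev.get(x, 0) + Fraction(1, s) * m_new.get(x, 0)
        for x in keys
    }


outcomes = "HHTHTHHT"
bbas = [{o: 1} for o in outcomes]  # binary bba's, one per toss

# Start from R_1 = m_1 and absorb the remaining sources one at a time.
running = {x: Fraction(v) for x, v in bbas[0].items()}
for s, m in enumerate(bbas[1:], start=2):
    running = t_average(running, m, s)

print(running == average(bbas))
```

The sequential update reproduces the batch average exactly, which is the property one would like to have for PCR6 as well; no such operator is given here for PCR6, since the text leaves it as an open question.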
Examples where the Conjunctive and Dempster’s Rules are Insensitive

Abstract—In this paper we present several counter-examples to the Conjunctive rule and to Dempster rule of combinations in information fusion.


Keywords— conjunctive rule, Dempster rule, DSmT, counter- 
examples to Conjunctive rule, counter-examples to Dempster rule, 
information fusion 


I. INTRODUCTION

In Counter-Examples to Dempster’s Rule of Combination {Ch. 5 of Advances and Applications to DSmT on Information Fusion, Vol. I, pp. 105-121, 2004} [1], J. Dezert, F. Smarandache, and M. Khoshnevisan have presented several classes of fusion problems which could not be directly approached by the classical mathematical theory of evidence, also known as Dempster-Shafer Theory (DST), either because Shafer’s model for the frame of discernment was impossible to obtain, or just because Dempster’s rule of combination failed to provide coherent results (or no result at all). We have shown and discussed the potential of the DSmT combined with its classical (or hybrid) rule of combination to attack these infinite classes of fusion problems.

We have given general and concrete counter-examples for Bayesian and non-Bayesian cases.

In this article we construct new classes where both the conjunctive and Dempster’s rule are insensitive.


II. DEZERT-TCHAMOVA COUNTER-EXAMPLE


In [2], J. Dezert and A. Tchamova have introduced for the first time the following counter-example with some generalizations. This first type of example has then been discussed in detail in [3,4] to question the validity of the foundations of Dempster-Shafer Theory (DST). In the next sections of this short paper, we provide more counter-examples extending this idea. Let the frame of discernment be $\Theta = \{A, B, C\}$, under Shafer’s model (i.e. all intersections are empty), and let $m_1(.)$ and $m_2(.)$ be two independent sources of information that give the masses below:

    Focal Elements    A    C                  A ∪ B    A ∪ B ∪ C
    m_1               a    0                  1 - a    0
    m_2               0    1 - b_1 - b_2      b_1      b_2
    Table 1

where the parameters $a, b_1, b_2 \in [0,1]$, and $b_1 + b_2 < 1$.

Originally published as Smarandache F., Kroumov V., Dezert J., Examples where the conjunctive and Dempster's rules are insensitive, Proc. of 2013 International Conference on Advanced Mechatronic Systems, Luoyang, China, Sept. 25-27, 2013, and reprinted with permission.


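Applying the conjunctive rule, in order to combine $m_1 \oplus m_2 = m_{12}$, one gets:

$$
\begin{aligned}
m_{12}(A) &= a (b_1 + b_2) && (1) \\
m_{12}(C) &= 0 && (2) \\
m_{12}(A \cup B) &= (1 - a)(b_1 + b_2) && (3) \\
m_{12}(A \cup B \cup C) &= 0 && (4)
\end{aligned}
$$

and the conflicting mass

$$
m_{12}(\emptyset) = 1 - b_1 - b_2 = K_{12}. \qquad (5)
$$

After normalizing by dividing by $1 - K_{12} = b_1 + b_2$ one gets Dempster’s rule result $m_{DS}(.)$:

$$
\begin{aligned}
m_{DS}(A) &= \frac{m_{12}(A)}{1 - K_{12}} = \frac{a (b_1 + b_2)}{b_1 + b_2} = a = m_1(A) \\
m_{DS}(A \cup B) &= \frac{m_{12}(A \cup B)}{1 - K_{12}} = \frac{(1 - a)(b_1 + b_2)}{b_1 + b_2} = 1 - a = m_1(A \cup B)
\end{aligned}
\qquad (6)
$$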


Counter-intuitively, after combining two sources of information, $m_1(.)$ and $m_2(.)$, with Dempster’s rule, the result does not depend at all on $m_2(.)$. Therefore Dempster’s rule is insensitive to $m_2(.)$ no matter what the parameters $a$, $b_1$, $b_2$ are equal to.
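
This insensitivity is easy to check numerically. The following sketch assumes Shafer's model with sets encoded as `frozenset` objects; the function name `dempster` and the particular parameter values $a = 3/10$, $b_1 = 1/5$, $b_2 = 1/2$ are illustrative choices only.

```python
from fractions import Fraction
from itertools import product


def dempster(m1, m2):
    """Dempster's rule for two sources; returns (fused bba, conflict K12)."""
    joint, conflict = {}, Fraction(0)
    for (x, u), (y, v) in product(m1.items(), m2.items()):
        if x & y:
            joint[x & y] = joint.get(x & y, 0) + u * v
        else:
            conflict += u * v  # mass falling on the empty set
    if conflict == 1:
        raise ZeroDivisionError("total conflict: Dempster's rule is undefined")
    return {x: v / (1 - conflict) for x, v in joint.items()}, conflict


A, C = frozenset("A"), frozenset("C")
AB, ABC = frozenset("AB"), frozenset("ABC")

a, b1, b2 = Fraction(3, 10), Fraction(1, 5), Fraction(1, 2)  # assumed values
m1 = {A: a, AB: 1 - a}
m2 = {C: 1 - b1 - b2, AB: b1, ABC: b2}

fused, k12 = dempster(m1, m2)
print(fused == m1, k12)
```

For these values the fused bba is identical to $m_1$ and the conflict equals $1 - b_1 - b_2$; changing $b_1$ and $b_2$ (with $b_1 + b_2 > 0$) changes the conflict but not the fused result.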
III. FUSION SPACE

In order to generalize this counter-example, let’s start by defining the fusion space.

Let $\Theta$ be a frame of discernment formed by n singletons $\theta_i$, defined as:

$$
\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}, \quad n \ge 2, \qquad (7)
$$

and its Super-Power Set (or fusion space):

$$
S^\Theta = (\Theta, \cup, \cap, \mathcal{C}) \qquad (8)
$$

which means the set $\Theta$ closed under union $\cup$, intersection $\cap$, and respectively complement $\mathcal{C}$.


IV. ANOTHER CLASS OF COUNTER-EXAMPLES TO DEMPSTER’S RULE

Let $A_1, A_2, \ldots, A_p \in S^\Theta \setminus \{I_t, \emptyset\}$, for p > 1, such that $A_i \cap A_j = \emptyset$ for $i \ne j$, where $I_t$ is the total ignorance ($\theta_1 \cup \theta_2 \cup \ldots \cup \theta_n$), and $\emptyset$ is the empty set.

Therefore each $A_i$, for $i \in \{1, 2, \ldots, p\}$, can be either a singleton, or a partial ignorance (union of singletons), or an intersection of singletons, or any element from the Super-Power Set $S^\Theta$ (except the total ignorance or the empty set), i.e. a general element in the set theory that is formed by the operators $\cup, \cap, \mathcal{C}$.

Let’s consider two sources $m_1(.)$ and $m_2(.)$ defined on $S^\Theta$:

           A_1    A_2    ...    A_p    I_t
    m_1    a_1    a_2    ...    a_p    0
    m_2    b      b      ...    b      1 - p·b

where of course all $a_i \in [0, 1]$ and $a_1 + a_2 + \ldots + a_p = 1$, and also b and $1 - p \cdot b \in [0, 1]$.

$m_1(.)$ can be Bayesian or non-Bayesian depending on the way we choose the focal elements $A_1, A_2, \ldots, A_p$. We can make sure $m_2(.)$ is not the uniform basic belief assignment by setting $b \ne 1 - p \cdot b$.

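Let’s use the conjunctive rule for $m_1(.)$ and $m_2(.)$:

$$
m_{12}(A_i) = m_1(A_i) m_2(A_i) + [m_1(A_i) m_2(I_t) + m_1(I_t) m_2(A_i)] = a_i b + [a_i (1 - p \cdot b) + 0 \cdot b] = a_i (1 - p \cdot b + b),
$$

for all $i \in \{1, 2, \ldots, p\}$. (9)

It is interesting to find out that, according to the Conjunctive Rule, the conflict of the above two sources does not depend on $m_1(.)$ at all, but only on $m_2(.)$, which is abnormal:

$$
K_{12} = \sum_{i=1}^{p} \sum_{\substack{j=1 \\ j \ne i}}^{p} m_1(A_i) m_2(A_j) = \sum_{i=1}^{p} \sum_{\substack{j=1 \\ j \ne i}}^{p} a_i b = \sum_{i=1}^{p} (p - 1) a_i b = (p - 1) b \sum_{i=1}^{p} a_i = (p - 1) b. \qquad (10)
$$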
Therefore even the feasibility of the Conjunctive Rule is questioned.

When we normalize, as in Dempster’s Rule, by dividing all $m_{12}(.)$ masses by the common factor $1 - K_{12} = 1 - p \cdot b + b$, we actually get $m_1 \oplus m_2 = m_1$! So $m_2(.)$ has no impact on the fusion result according to Dempster’s Rule, which is not normal.
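
A small numerical check of this class of counter-examples is sketched below. It assumes p = 3 pairwise disjoint focal elements taken as the singletons of a three-element frame, with hypothetical masses $a = (1/2, 3/10, 1/5)$ for $m_1(.)$; the labels `t1`, `t2`, `t3` and the function name `dempster` are illustrative.

```python
from fractions import Fraction
from itertools import product


def dempster(m1, m2):
    """Dempster's rule for two sources; returns (fused bba, conflict K12)."""
    joint, conflict = {}, Fraction(0)
    for (x, u), (y, v) in product(m1.items(), m2.items()):
        if x & y:
            joint[x & y] = joint.get(x & y, 0) + u * v
        else:
            conflict += u * v
    return {x: v / (1 - conflict) for x, v in joint.items()}, conflict


thetas = ["t1", "t2", "t3"]
A = [frozenset([t]) for t in thetas]  # disjoint focal elements A_1, ..., A_p
I_t = frozenset(thetas)               # total ignorance
p = len(A)

m1 = dict(zip(A, [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]))

for b in (Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)):
    # m2 gives the same mass b to every A_i and the rest to I_t.
    m2 = {**{Ai: b for Ai in A}, I_t: 1 - p * b}
    fused, k12 = dempster(m1, m2)
    print(b, k12 == (p - 1) * b, fused == m1)
```

For each tested value of b the conflict equals (p - 1)·b and the normalized result equals $m_1$, as derived in (9) and (10).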


V. MORE GENERAL CLASS OF COUNTER-EXAMPLES TO DEMPSTER’S RULE

Let’s consider r+1 sources: the previous $m_1(.)$ and respectively various versions of the previous $m_2(.)$:

             A_1    A_2    ...    A_p    I_t
    m_1      a_1    a_2    ...    a_p    0
    m_21     b_1    b_1    ...    b_1    1 - p·b_1
    m_22     b_2    b_2    ...    b_2    1 - p·b_2
    ...      ...    ...    ...    ...    ...
    m_2r     b_r    b_r    ...    b_r    1 - p·b_r

where of course all $a_i \in [0, 1]$ and $a_1 + a_2 + \ldots + a_p = 1$, and also all $b_j$ and $1 - p \cdot b_j \in [0, 1]$, for $j \in \{1, 2, \ldots, r\}$.

Now, if we combine all these sources we get

$$
m_1 \oplus m_{21} \oplus m_{22} \oplus \ldots \oplus m_{2r} = m_1. \qquad (11)
$$

Therefore all r sources $m_{21}(.), m_{22}(.), \ldots, m_{2r}(.)$ have no impact on the fusion result!

Interesting particular examples can be found in this case.


VI. SHORT GENERALIZATION OF DEZERT-TCHAMOVA COUNTER-EXAMPLE

Let’s consider four focal elements $A, B_1, B_2, B_3$, such that $A \cap B_i = \emptyset$ for $i \in \{1, 2, 3\}$, and $B_1, B_2, B_3$ are nested, i.e. $B_1 \subset B_2 \subset B_3$, and two masses, where of course $b_1 + b_2 = 1$ and $c_1 + c_2 + c_3 = 1$, and all $b_1, b_2, c_1, c_2, c_3 \in [0, 1]$:

            A      B_1             B_2             B_3
    m_1     0      b_1             b_2             0
    m_2     c_1    0               c_2             c_3
    m_12    0      b_1(1 - c_1)    b_2(1 - c_1)    0
    and the conflict K_12 = c_1(b_1 + b_2) = c_1
    m_DS    0      b_1             b_2             0

a) This generalization permits the use of hybrid models; for example one may have the frame of discernment of exclusive elements {A, B, C}, where $B_1 = B \cap C$, $B_2 = B$, and $B_3 = B \cup C$.

b) Other interesting particular cases may be derived from this short generalization.
VII. PARTICULAR COUNTER-EXAMPLE TO THE CONJUNCTIVE RULE AND DEMPSTER’S RULE

For example let $\Theta = \{A, B, C\}$, in Shafer's model. We show that the conflicts between sources are not correctly reflected by the conjunctive rule, and that a certain non-vacuous non-uniform source is ignored by Dempster’s rule.

Let's consider the masses:

           A      B      C      A ∪ B ∪ C
    m_1    1      0      0      0      (the most specific mass)
    m_2    1/3    1/3    1/3    0      (very unspecific mass)
    m_3    0.6    0.4    0      0      (mass between the very unspecific and the most specific masses)
    m_0    0.2    0.2    0.2    0.4    (not vacuous mass, not uniform mass)

Then the conflict $K_{10} = 0.4$ between $m_1(.)$ and $m_0(.)$ is the same as the conflict $K_{20}$ between $m_2(.)$ and $m_0(.)$, and similarly the same as the conflict $K_{30}$ between $m_3(.)$ and $m_0(.)$, which is not normal, since $m_1(.)$ is the most specific mass while $m_2(.)$ is the most unspecific mass.


Let's also check what happens when combining two sources using Dempster’s rule:

$m_1 \oplus m_0 = m_1$, $m_2 \oplus m_0 = m_2$, $m_3 \oplus m_0 = m_3$,

which is not normal.

In order to get the "normal behavior" we combine $m_1(.)$ and $m_0(.)$ with PCR5, and similarly for the others: $m_2(.)$ combined with $m_0(.)$, and $m_3(.)$ combined with $m_0(.)$.

In order to know what should have been the "normal behavior" for the conflict (the initial conflict was 0.4 in all three cases), let's make a small change to $m_0(.)$ as below:

           A      B      C      A ∪ B ∪ C
    m_1    1      0      0      0      (the most specific mass)
    m_2    1/3    1/3    1/3    0      (very unspecific mass)
    m_3    0.6    0.4    0      0      (mass between the very unspecific and the most specific masses)
    m_0    0.3    0.2    0.1    0.4    (not vacuous mass, not uniform mass)

    K_10 = 0.30
    K_20 = 0.40
    K_30 = 0.34

Now, the conflicts are different.
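
The conflict values quoted in this section, and the insensitivity of Dempster's rule to the original $m_0(.)$, can be verified with a short script. This is a sketch under the assumption of Shafer's model with `frozenset` encodings; the names `conflict`, `dempster` and `m0_mod` are illustrative.

```python
from fractions import Fraction as Fr
from itertools import product


def conflict(ma, mb):
    """Conjunctive conflict: total mass of pairs with empty intersection."""
    return sum(u * v for (x, u), (y, v) in product(ma.items(), mb.items())
               if not (x & y))


def dempster(ma, mb):
    """Dempster's rule: conjunctive consensus normalized by 1 - K."""
    k = conflict(ma, mb)
    joint = {}
    for (x, u), (y, v) in product(ma.items(), mb.items()):
        if x & y:
            joint[x & y] = joint.get(x & y, 0) + u * v
    return {x: v / (1 - k) for x, v in joint.items()}


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
ABC = A | B | C

m1 = {A: Fr(1)}
m2 = {A: Fr(1, 3), B: Fr(1, 3), C: Fr(1, 3)}
m3 = {A: Fr(3, 5), B: Fr(2, 5)}
m0 = {A: Fr(1, 5), B: Fr(1, 5), C: Fr(1, 5), ABC: Fr(2, 5)}
m0_mod = {A: Fr(3, 10), B: Fr(1, 5), C: Fr(1, 10), ABC: Fr(2, 5)}

print([conflict(m, m0) for m in (m1, m2, m3)])      # identical conflicts
print([dempster(m, m0) == m for m in (m1, m2, m3)])  # m0 is ignored
print([conflict(m, m0_mod) for m in (m1, m2, m3)])   # conflicts now differ
```

With the original $m_0(.)$ the three conflicts all equal 2/5 and each Dempster combination returns the other source unchanged; with the modified $m_0(.)$ the conflicts become 3/10, 2/5 and 17/50, i.e. 0.30, 0.40 and 0.34.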
VIII. CONCLUSION


We showed in this paper that: first, the conflict was the same no matter what one of the sources was (and it is abnormal that a non-vacuous non-uniform source has no impact on the conflict), and second, that the result using Dempster’s rule is not at all affected by a non-vacuous non-uniform source of information.

Normally, the most specific mass (bba) should dominate the fusion result.

Therefore, the conflicts between sources are not correctly reflected by the conjunctive rule, and certain non-vacuous non-uniform sources are ignored by Dempster’s rule in the fusion process.
Abstract: The Uncertainty Representation and Reasoning Evaluation Framework (URREF) includes an ontology that represents concepts and criteria needed to evaluate the uncertainty management aspects of a fusion system. The URREF ontology defines self-confidence as a measure of the information credibility as evaluated by the sensor itself. The concept of confidence, which is not explicitly defined in the URREF ontology, has been extensively explored in the literature about evaluation in information fusion systems (IFS). In this paper, we provide a discussion on confidence as it relates to the evaluation of IFS, and compare it with the existing concepts in the URREF ontology. Our goal is two-fold, since we address both the distinctions between confidence and self-confidence, as well as the implications of these differences when evaluating the impact of uncertainty on the decision-making processes supported by the IFS. We illustrate the discussion with an example of decision making that involves signal detection theory, confusion matrix fusion, subjective logic, and proportional conflict redistribution. We argue that uncertainty can be minimized through confidence (information evidence) and self-confidence (source agent) processing. The results here seek to enrich the ongoing discussion at the ISIF’s Evaluation of Techniques for Uncertainty Representation Working Group (ETURWG) on self-confidence and trust in information fusion systems design.


Keywords: Self-Confidence, Confidence, Trust, Level 5 Fusion, High-Level Information Fusion, PCR5/6, Subjective Logic


I. INTRODUCTION


Information fusion aims to achieve uncertainty reduction through combining information from multiple complementary sources. The International Society of Information Fusion (ISIF) Evaluation of Techniques of Uncertainty Reasoning Working Group (ETURWG) was chartered to address the problem of evaluating fusion systems’ approaches to representing and reasoning with uncertainty. The working group developed the Uncertainty Representation and Reasoning Evaluation Framework (URREF) [1]. Discussions during 2013 explored the notions of credibility and reliability [2]. One recent issue is the difference between confidence and self-confidence as related to the data, source, and processing. While agreement is not complete among the ETURWG, this paper seeks to provide one possible approach to relate the mathematical, semantic, and theoretical challenges of confidence analysis.


The key position of the paper is to analyze the practical differences in evaluating the two concepts. More specifically, self-confidence is mostly relevant to HUMINT, which makes its evaluation primarily subjective; whereas confidence can be easily traced to machine data analysis, allowing for the use of objective metrics in its evaluation. That is, a machine can process large amounts of data to represent the state of the world, and the evaluation of how well uncertainty is captured in these processes can be traced to various objective metrics. In contrast, for a human to assess his or her own confidence in the credibility of his or her “data collection process” (i.e. self-confidence), he or she has to make a judgment on limited choices. Objective assessment is determined from the credibility of the reports, processing, and decisions. Typical approaches include artificial intelligence (AI) methods (e.g., Neural Networks), pattern recognition (e.g., Bayesian, wavelets), and automatic target exploitation (i.e., over sensor, target, and environment operating conditions [3]). Subjective analysis is a report quality opinion that factors in analysis (e.g., completeness, accuracy, and veracity), knowledge (e.g., representation, uncertainty, and reasoning), and judgment (e.g., intuition, experience, decision making) [4]. In terms of IFS support for decision-making, numerous methods have been explored, mainly from Bayesian reasoning, Dempster-Shafer Theory [5], Subjective Logic [6], DSmT [7], fuzzy logic and possibility theory; although it also includes research on approximating belief functions to subjective probability measures (BetP [8], DSmP [9]).


Figure 1 provides a framework for our discussion. The world contains some truth T, of which data is provided from different sources (A, B). Source A analysis goes to a machine agent for information fusion processing while source B goes to a human agent. Either the machine or the human can generate beliefs about the state of the world (either using qualitative or quantitative semantics). The combination of A and B is a subject of Level 5 Fusion (user refinement) [10, 11].

Figure 1. Information Fusion System assessment of confidence (machine) and self-confidence (sensor or human agent).

On one hand, confidence is typically related to machine processing such as signal detection theory, whereas self-confidence is associated with sensors (and humans) assessing their own capability. On the other hand, the manipulation of the data requires understanding of the source, and self-confidence is applicable for the cases in which the user can provide a self-assessment of how confident he or she is in his or her data. It is important to emphasize that we are not dealing (at least not directly) with information veracity: even if the sensor (e.g., a human reporting on an event or providing assessment on a situation) considers the information as possible, and trusts it, it could be false in the end (e.g., even in summer time we can have a cloudy day). That is, self-confidence assesses how much the author trusts the information, but not necessarily whether this information is false or true [1]. For this paper, we take the URREF ontology definition of self-confidence as implied in Figure 1. The rationale for this choice is that self-confidence and uncertainty are typically associated with humans whereas confidence has been typically used in signal detection. Fusion of beliefs ultimately relates to states of the world with a reported confidence that can be compared to a truth state. Debate on the overlaps in terminology would be welcomed to clarify these positions for the ETURWG and the information fusion community as a whole.


From Figure 1, we note the importance of confidence as 
related from a decision to the estimated states. Self-confidence 
is within the human agent assessing their understanding (e.g. 
experience) that can also be combined with the computer 
agent. The issue at hand for a user is whether or not the 
machine analysis (or their own) state decision represents 
reality. The notion of reality comes from the fact that 
currently, there are many technical products that perceive the 
world for the user (e.g., video) from which the user must map 
a mental model to a physical model of what is being 
represented. Some cases are easy such as video of cars moving 
on a road [12]; however, others are complex such as cyber 
networks [13]. The example used through the rest of the paper 
requires High-Level Information Fusion (HLIF) of target 
detection from a machine and human [14, 15]. 


In designing computer-aided detection machines, it is 
desirable to provide intelligence amplification (IA) [16] where 
Qualia motivates subjective analysis as a relation between the 
human consciousness/self-awareness to external stimuli. 
Qualia is the internal perception of the subjective aspect of the 
human’s perception of the stimuli. Knowing oneself can then 
be utilized to understand/evaluate the use of meaningful and 
relevant data in decision-making. The more that a sensor 
understands its Qualia [17], the better it will be in providing an 
assessment of its self-confidence in a report or on a decision. 
Qualia then encompasses an important component to 
uncertainty reasoning associated with subjective beliefs, trust, 
and self-confidence in decision making as a sense of intuition. 
Not surprisingly, these are natural discussion topics in Level 5 
fusion (‘user refinement’), which includes operator-machine 
collaboration [18], situation awareness/assessment displays 
[19], and trust [20]. In order to explore self-confidence on 
these issues, we need to look at the psychology literature on 
trust as it relates to self-confidence.


From data available on the web (e.g., Twitter, documents),
intelligent users need the capability to rapidly monitor and
analyze event information over massive amounts of
unstructured textual data [21]. Text from human sources is
subject to opinions, beliefs, and misperceptions, generating
various forms of self-assumed self-confidence. In contrast,
computer-sensed data can be stochastic or deterministic, from
which we have to coordinate the agent information. For
example, Gaussian observations lead to stochastic
probability analyses (e.g., the Kalman filter). However, structural
information in the sensor models and sensitivities for a given 
state condition (which come from a deterministic ontology) 
could be used to improve the estimate [22]. This combination 
of both stochastic and deterministic decisions with uncertainty 
elements is usual in modeling and system deployment, and 
understanding its key aspects is a fertile area for producing 
better decision support from IFS. 


A related example from the analysis of uncertainty is evidence
assessment from opinion makers. Dempster-Shafer theory has
been used in connection with Bayesian analysis for
decision making [5]. Likewise, Jøsang [23] demonstrated how
subjective analysis within Dempster-Shafer theory could be used to
determine the weight of opinions. Ontologies such as the
one used in the URREF must be able to account for the
uncertainty of data and to model it qualitatively, semantically,
and quantitatively [24]. Metrics such as quality of service
(QoS) and information quality (IQ) are examples of tools
that can support and enhance a modeling capability between
ontologies and uncertainty analysis [25]. The rest of this 
paper includes Sect. II as an overview of self-confidence. 
Sect. III discusses the mathematical analysis. Sect. IV 
highlights subjective logic for opinion making. Sect. V is an 
example and Sect. VI provides conclusions. 


II. URREF NOTIONS OF SELF-CONFIDENCE 


The ETURWG has explored many topics as related to a 
systems analysis of information fusion, which includes 
characteristics of uncertainty with many unknowns [26, 27]. In 
this paper, we categorize the characteristics of uncertainty into 
four areas, shown in Table 1. Assuming that the flow of 
information first goes from an agent to evidence beliefs, and 
subsequently to fusion with knowledge representation, then 
these areas help understand the terminology. Note that the 
defined information fusion quality of service (QoS) 
parameters are in blue {timeliness, accuracy, throughput, and 
confidence}. These could also be measures of performance 
[28]. For measures of effectiveness [25], one needs to 
understand system robustness (e.g., consistency, completeness, 
correctness, integrity). Here we focus on the red terms as 
related to self-confidence and confidence. 


Knowledge representation in IFS [29, 30] includes applying 
decision-making semantics to support the structuring of 
extracted information. One example is the use of well defined 
concepts (e.g. confirmed, probable, possible, doubtful, and 
improbable) to support information extraction with natural 
language processing (NLP) algorithms. As related to 
confidence and self-confidence, there is the notion of integrity. 




Table 1: Characteristics of Uncertainty *

Agent (Source)            | Evidence (Information) | Algorithm (Fusion)      | Representation (Knowledge Reasoning)
                          |                        | Scalability             | Knowledge Handling
Objectivity               | Relevance              | Computational Cost      | Simplicity
Observational Sensitivity | Conclusiveness         | Adaptability            | Expressiveness
Veracity (truthfulness)   | Veracity (truth)       | Traceability (pedigree) | Polarity
Secure                    | Ambiguity              | Stability               | Modality
Resilient                 |                        |                         | Genericity
Trust                     | Precision              | Throughput              | Tense
                          | Accuracy               | Timeliness              |
Reliability               | Credibility            | Correctness             | Completeness
Self-Confidence           |                        | Consistency             | Integrity

* This table is presented to the ETURWG in this paper to support ongoing discussions on the categorization of types of uncertainty.


Integrity for human agents is associated with their subjective 
accountability and consistency in making judgments. Integrity 
for a machine could be objective in the faithful representation
and validity of the data [31].


Algorithm performance focuses on the information fusion
method. URREF criteria for evaluating it relate to how the
uncertainty model performs operations with information. An
example of related metrics is to assess uncertainty reduction 
by weighting good data over bad given conflicting data. 


Evidence: From [2], we explored the weight of evidence 
(WOE) as a function of reliability, credibility, relevance, and 
completeness. In URREF, WOE assesses how well an 
uncertainty representation technique captures the impact of an 
input affecting the processing and output of the IFS. 


Source: Self-confidence, while yet to have a clear definition in 
the engineering literature, is typically associated with trust. 


A. Trust 


Closely associated with subjective analysis is trust [32]. Trust 
includes many attributes for man-machine systems such as 
dependability (machine), competence (user), and application 
[33]. Trust is then related to machine processing (confidence) 
and human assessment (self-confidence). Trust in automation 
is a key attribute associated with machine-driven solutions. 
Human trust in automation determines a user’s reliance on 
automation. In [32], the authors explored self-confidence, defined as
the user's anticipated (or post hoc) performance with machines,
which affects trust in policy application.


Measuring trust as related to uncertainty is an open topic [34]. 
As a focus of discussion, we have a machine agent and a 
human agent of which a measure of trust comes from the 
uncertainty associated between the man-machine interactions. 
Reliability trust could be between human agents of which 
subjective probability is useful [35]. Decision trust could be 
between human agents or between a human and a machine and 
takes into account the risk associated with situation-dependent 
attitudes, attention, and workload of a human agent. The 
distinction between reliability and decision trust is important 
as related to self-confidence and confidence. This can be seen 
in Table 2, which depicts the main aspects for each of the six 
potential interactions between sensors.


Table 2: Trust Aspects in Sensor Interactions

        | Human           | Others      | Machine
Human   | Self-confidence | Reliability | Trustworthy
Machine | Trust           | Credibility | Confidence





• Human: Individuals must provide introspection on their
own analysis and interaction with a machine. Here we
distinguish between self-confidence and trust. In this
case, human agents must have self-confidence in
themselves as well as trust in the machine.

• Others: With the explosion of the Internet, recent work has
explored the uncertainty of human sensing, such as
Twitter reports in social networks, showing humans as
less calibrated and reliable in their sensing. Wang et al.
[36, 37] developed an estimation approach for truth
discovery in this domain. Another recent example
explored the decision-making trust between humans
interfacing through a machine. The user interface was
shown to have a strong impact on trust, cooperation, and
situation awareness [38]. An interesting result was that
credibility emerged as the computer interaction afforded
complete and incomplete information towards
understanding both the machine and the user analysis.

• Machine: A large body of literature is devoted to network
trust. Examples include the hardware, cyber networks
[39], protocols and policies. Given the large number of
cyber attacks launched by hackers, it comes down to a
trustworthy network of confidentiality, integrity, and
availability. For machine-machine processing without
user-created malware, network engineering analysis is
mostly one of confidence. Machine trust is also
important to enterprise systems [40].


Since we seek to understand self-confidence as a URREF 
criterion, explorations included human processing and the 
human as a data source as shown in Figure 2.

[Figure 2 labels: Network Trust, Machine Trust, Software Trust, Algorithm Trust, Modeling Trust, Application Trust, User Trust; Real World, Sensor, Detection, Confidence, Decision; Decision (perceived) Represents Reality; Trust Boundary.]

Figure 2 — Methods of Trust.


In the figure, multiple forms of trust are shown and related to
the processing steps. Starting from the real world, data is
placed on the network from which a machine (or sensor)
processes the data. With context, a detection assessment is
made for such domains as image, text, or cyber processing.
That detection is then fused with the context information. For
example, a detection of an object in an image is fused with
contextual road information. The detection confidence is
assessed and made available to the user with the context. The
green line is the human-machine trust boundary, as the human
can look at the machine results or process the data themselves
for a given level of confidence and render a decision. The
dotted line then is a human assessment of whether or not the
information presented represents reality and could be trusted.


Note, if the human is the only sensor source, then he/she is 
looking at data and making a decision. Their self-confidence 
could be based on the machine results from which they factor 
in many types of trust. For example, context, as related to the 
real world (see Figure 1), provides a validation of the machine 
(network to algorithm trust as a measure of confidence), while 
at the same time understanding the situation to determine if the 
information fusion analysis is providing meaningful and useful 
information towards the application of interest. Together a 
trusted decision is rendered based on the many factors. 


Included in Figure 2 are many forms of trust in the analysis, all
of which can lead to confidence in the decision:

Trust       | Processing            | Example
Network     | Data put on a network | Assessment of data timeliness and lost packets
Machine     | Sensor transformation | Calibration of cameras for image content
Software    | Information management | Getting the correct data from a database (e.g., a priori data)
Algorithm   | Fusion method         | Target tracking and classification results
Modeling    | State models          | Kinematic and target recognition models (e.g., training data)
Application | Situation of interest | Analysis over the correct area (e.g., target moving on a road)
User        | Situation awareness   | Use of culture and behavior (e.g., assume big cars move on roads)





The self-confidence of the user analysis includes working with 
data, networks, and machines. The URREF ontology must 
account for trust over human-machine decisions for 
confidence analysis. To further explore how URREF is 
aligned with trust, we must look at self-confidence. 


B. Self-Confidence 


Statistically speaking, the machine decision-making accuracy 
is based on the data available, the model chosen, and the 
estimation uncertainty associated with the measured data. 
Given the above analysis, we could start to derive self- 
confidence for machine fusion operations based on the 
literature in human self-confidence. 


Self-confidence is the socio-psychological concept related to 
self-assuredness in one's personal judgment and ability. As an 
example, researchers are often called to review papers and 
after their review asked to give a quality rating of their own 
review based on their understanding of the subject, expertise,
and experience. In another example, a person might be asked
to identify an object in an image with a certain rating
{unlikely, possible, probable, confirmed} from which they
could then determine self-confidence based on their answer. Thus,
there is a need to assess “self-confidence” in relation to 
“confidence”, which is linked to uncertainty measures of trust. 


C. Accuracy and Precision 


Self-confidence is strongly related to both precision and
accuracy. A source can be self-confident in both the precision
of its generated data (consistency or variability in its reports,
such as reported variance) as well as the accuracy of its reports
(the reported bias or the reported distance of the mean value of
the generated data from the true value). In other words, for the
term "self-confidence of a source" to make sense, the data must
encapsulate a combination of precision and accuracy. A
distinction is made between precision and accuracy reported 
by the machine (such as the estimated mean and variance at 
the output of the Kalman filter) and the actual precision and 
accuracy of the data emanating from the source. The URREF 
ontology categorizes accuracy, precision, and self-confidence 
as types of criteria to evaluate data [1]. 
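
The distinction between accuracy (bias) and precision (spread) can be illustrated with a minimal Python sketch. The function name `accuracy_and_precision` and the two lists of reports are hypothetical and chosen only for illustration; they are not taken from the URREF ontology or from the experiments in this paper.

```python
import statistics

def accuracy_and_precision(reports, true_value):
    """Return (bias, std_dev) for a list of numeric reports.

    bias    -> accuracy: signed distance of the mean report from the true value
    std_dev -> precision: spread (repeatability) of the reports
    """
    mean_report = statistics.fmean(reports)
    bias = mean_report - true_value
    std_dev = statistics.pstdev(reports)
    return bias, std_dev

# Hypothetical reports from two sources observing a true value of 10.0
precise_but_biased = [12.0, 12.1, 11.9, 12.0]   # tight spread, wrong mean
accurate_but_noisy = [8.0, 12.0, 9.0, 11.0]     # correct mean, wide spread

bias_a, spread_a = accuracy_and_precision(precise_but_biased, 10.0)
bias_b, spread_b = accuracy_and_precision(accurate_but_noisy, 10.0)
```

In this sketch the first source is precise but inaccurate, while the second is accurate on average but imprecise; a source's self-confidence report would ideally reflect both quantities.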


Statistical methods of uncertainty analysis from measurement 
systems include accuracy and precision, shown in Figure 3.
The use of distance metrics (accuracy) and precision metrics
(standard deviations) helps to analyze whether the
measurement is calibrated and repeatable. We would desire
the same analysis for human semantic analysis with precise
meanings, consistent understanding, and accurate terminology.


Figure 3 — Uncertainty as a function of accuracy and precision.


Human Confidence-Accuracy: Traditionally known as the 
confidence-accuracy (CA) relationship, the assumption is that 
as one’s confidence increases so does their level of accuracy 
which is affected by memory, consistency, and ability [41]. 
Issues include absolute versus relative assessment, feedback, 
and performance. 


The confidence-accuracy relationship was shown to be a by- 
product of the consistency-correctness relationship: It is 
positive because the answers that are consistently chosen are 
generally correct, but negative when the wrong answers tend 
to be favored. The overconfidence bias stems from the 
reliability-validity discrepancy: Confidence monitors 
reliability (or self-consistency), but its accuracy is evaluated in 
calibration studies against correctness. Also, the response
speed is a frugal cue for self-consistency and depends on the
validity of self-consistency in predicting performance [42]. 


Koriat [42] explains that sensing tasks are dominated by
Thurstonian uncertainty (local rank ordering with stochastic
noise) within an individual and exhibit an under-confidence
bias. However, general knowledge tasks are dominated by
Brunswikian uncertainty (global probabilistic model from
limited sample sets to infer general knowledge [43]) that 
supports inter-person ecological relations. 


Consistency is then the repeatability of the information, which 
should imply no conflicts in decision-making. We can use the 
proportional conflict redistribution (PCR6) to get a measure of 
repeated consistency such that favored wrong answers are 
corrected in confidence analysis [44]. PCR6 is more general
and efficient than PCR5 when combining more than two
sources altogether. Moreover, PCR6 has been proved
compatible with frequency probabilities when working with
binary BBAs, whereas PCR5 and DS are not compatible with
frequency probabilities [44]. 
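
As a rough illustration of proportional conflict redistribution, the following Python sketch implements the two-source PCR5 rule, which coincides with PCR6 when only two sources are combined. The representation of a BBA as a dictionary from `frozenset` to mass, the function name `pcr5_two_sources`, and the numerical masses are assumptions made for this example only.

```python
from itertools import product

def pcr5_two_sources(m1, m2):
    """Combine two BBAs (dict: frozenset -> mass) with the two-source PCR5 rule."""
    fused = {}
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        prod = mx * my
        if prod == 0.0:
            continue
        inter = x & y
        if inter:
            # Non-conflicting product goes to the intersection (conjunctive part)
            fused[inter] = fused.get(inter, 0.0) + prod
        else:
            # Conflicting product is returned to x and y in proportion to
            # the masses m1(x) and m2(y) that produced the conflict
            fused[x] = fused.get(x, 0.0) + mx * prod / (mx + my)
            fused[y] = fused.get(y, 0.0) + my * prod / (mx + my)
    return fused

# Hypothetical masses on the frame {A, B}
A = frozenset({"A"})
B = frozenset({"B"})
AB = A | B
m1 = {A: 0.6, AB: 0.4}
m2 = {B: 0.3, AB: 0.7}
fused = pcr5_two_sources(m1, m2)
```

Here the conflicting mass 0.6 x 0.3 = 0.18 is split as 0.12 to A and 0.06 to B, so the fused masses are 0.54 for A, 0.18 for B, and 0.28 for A ∪ B.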


Self-confidence could be measured with a Receiver Operating
Characteristic (ROC) curve: once a decision is made,
we can then assess its impact on confidence. A low self-
confidence would lead to chance performance, and a high self-confidence
would remain to the left on the ROC.


III. SELF-CONFIDENCE


Signal detection theory provides a measure of confidence in
decision making that, by assuming a limited hypothesis set, is
actually a measure of self-confidence. One classic example is
Wald's Sequential Probability Ratio Test (SPRT) [45].
Assuming evidence is sampled at discrete time intervals, the
human or cognitive agent compares the conditional
probabilities of $x(t + \Delta t)$ for two hypotheses $H_j$ (j = 1, 2). Using
the SPRT, then

$$ y(t) = \ln \frac{f_1[x(t)]}{f_2[x(t)]} \qquad (1) $$

If $y(t) > 0$, then evidence supports $H_1$, and if $y(t) < 0$, then $H_2$
is more likely. As time accumulates for decision making, there
is an aggregation of the log likelihood ratios:

$$ L(t + \Delta t) = L(t) + \ln \frac{f_1[x(t + \Delta t)]}{f_2[x(t + \Delta t)]} \qquad (2) $$

where, for a stochastic system, $L(t) \sim N(\mu(t), \sigma^2(t))$. Eq. (2) can
be written in Bayesian log odds, with D denoting the accumulated evidence:

$$ \ln \frac{p(H_1 \mid D)}{p(H_2 \mid D)} = \sum_{t} \ln \frac{f_1[x(t)]}{f_2[x(t)]} + \ln \frac{p(H_1)}{p(H_2)} \qquad (3) $$


One then collects information to make a decision such that
$-\theta < L(t) < \theta$. The chosen threshold is then a measure of a
decision, which can be conservative or aggressive for the case
of a human agent [46]. Figure 4 shows the case in which
evidence is accumulated and a decision is made with
associated standard boundaries for semantic decision making.
Also in Figure 4 we relate decision boundaries for semantic
confidence classification [47].


It is noted that a choice in time is not just the product of the
current analysis, but the accumulated evidence. For example,
in Figure 4, we see that the signal is moving between semantic
boundaries from doubtful to probable, with an associated
measure label of possible. Given the history, then the decision 
maker could be self-confident in the current measurement 
given their perception of the entire processing of machine 
decision making measures for each time. 
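
The accumulation of log-likelihood ratios up to a decision threshold can be sketched in a few lines of Python. This is only an illustrative sketch: the Gaussian densities, the symmetric threshold `theta`, and the function names are assumptions, and the paper's own simulation settings are not reproduced here.

```python
import math

def gaussian_log_pdf(x, mean, sigma):
    """Log of the Gaussian density N(mean, sigma^2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)

def sprt(samples, mean1, mean2, sigma, theta):
    """Accumulate log-likelihood ratios until |L| reaches theta.

    Returns (decision, L, n_used), where decision is 'H1', 'H2', or None
    if the samples run out before either threshold is crossed.
    """
    L = 0.0
    for n, x in enumerate(samples, start=1):
        # y(t) = ln f1[x(t)] - ln f2[x(t)], the per-sample log-likelihood ratio
        y = gaussian_log_pdf(x, mean1, sigma) - gaussian_log_pdf(x, mean2, sigma)
        L += y
        if L >= theta:
            return "H1", L, n
        if L <= -theta:
            return "H2", L, n
    return None, L, len(samples)

# Hypothetical evidence stream that consistently favors H1 (mean 1.0 vs mean 0.0)
decision, L_final, n_used = sprt([1.0] * 10, mean1=1.0, mean2=0.0, sigma=1.0, theta=1.9)
```

A larger `theta` corresponds to a more conservative decision maker who demands more accumulated evidence, while a smaller `theta` gives faster, more aggressive decisions.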


A Peircean hypothesis [48] implies confidence is a
multiplicative function of the quantity of the information
needed to make a decision (θ, or the distance traveled by the
diffusion process) and the quality of the information (δ, or the
rate of evidence accumulation in the diffusion process)
accumulated in Dynamic Signal Detection [48]. Without bias,
the authors of [48] show that:


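$$ \mathrm{conf} = \beta \ln \frac{P(R_1 \mid S_1)}{P(R_2 \mid S_1)} = \beta \, \frac{2 \theta \delta}{\sigma^2} \qquad (4) $$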


where β is a scaling parameter. A decision, θ, is related to a
response (R) of detection to a stimulus (S). Given the ability to
model self-confidence as a measure of precision, we extend
the methodology using subjective logic and DSmT [44] for
robust decision making.


Figure 4 — Evidence Accumulation for Decision Confidence.


IV. SUBJECTIVE OPINIONS


Subjective opinions [49] are special cases of belief functions
as they correspond to BBAs defined on 2D frames of type Θ =
{A, ~A} assuming Shafer’s model or DSmT. Subjective opinions
lend themselves to simple mathematical expressions of fusion
models. We therefore use the opinion representation for
describing the various fusion models, but the expressions can
easily be mapped to traditional belief functions.


A subjective opinion expresses belief about statements in a
frame. Let X be a frame of cardinality k. An opinion
distributes belief mass over the reduced powerset R(X) of
cardinality κ. The reduced powerset R(X) is defined as:

$$ R(X) = P(X) \setminus \{X, \emptyset\}, \qquad (5) $$

where $P(X) = 2^X$ denotes the powerset of X. All proper subsets
of X are elements of R(X), but the frame {X} and empty set
{Ø} are not elements of R(X).


Let $\vec{b}_X$ be a belief vector over the elements of R(X), $u_X$ be the
complementary uncertainty mass, and $\vec{a}_X$ be a base rate vector
over X. Whenever relevant, a superscript such as A denotes the
opinion owner. Then a subjective opinion $\omega^A_X$ is the composite
function expressed as:

$$ \omega^A_X = (\vec{b}_X, u_X, \vec{a}_X). \qquad (6) $$

The attribute A is thus the belief source, and X is the target
frame. The belief, uncertainty and base rate parameters satisfy
the following additivity constraints.


• Belief additivity:

$$ u_X + \sum_{x_i \in R(X)} \vec{b}_X(x_i) = 1, \quad \text{where } x_i \in R(X) \qquad (7) $$

• Base rate additivity:

$$ \sum_{i=1}^{k} \vec{a}_X(x_i) = 1, \quad \text{where } x_i \in X \qquad (8) $$

The belief vector $\vec{b}_X$ has $\kappa = (2^k - 2)$ parameters, whereas the
base rate vector $\vec{a}_X$ only has k parameters. The uncertainty
parameter $u_X$ is a simple scalar. A general opinion thus
contains $(2^k + k - 1)$ parameters. However, given that Eq.(7) and
Eq.(8) remove one degree of freedom each, opinions over a
frame of cardinality k only have $(2^k + k - 3)$ degrees of freedom.
The probability projection of hyper opinions is the vector
denoted as $\vec{E}_X$:

$$ \vec{E}_X(x_i) = \sum_{x_j \in R(X)} \vec{a}_X(x_i \mid x_j) \, \vec{b}_X(x_j) + \vec{a}_X(x_i) \, u_X, \quad \forall x_i \in R(X) \qquad (9) $$

where

$$ \vec{a}_X(x_i \mid x_j) = \frac{\vec{a}_X(x_i \cap x_j)}{\vec{a}_X(x_j)}, \quad \forall x_i, x_j \subset X, \qquad (10) $$

denotes relative base rate, i.e. the base rate of subset $x_i$ relative
to the base rate of (partially) overlapping subset $x_j$.
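
To make the reduced powerset and the relative base rate concrete, here is a small Python sketch. The helper names are hypothetical, and the sketch assumes that the base rate of a subset is the sum of the base rates of its singletons (additive base rates).

```python
from itertools import combinations

def reduced_powerset(frame):
    """All non-empty proper subsets of `frame`, returned as frozensets."""
    items = sorted(frame)
    return [frozenset(c)
            for r in range(1, len(items))
            for c in combinations(items, r)]

def base_rate(subset, a):
    """Base rate of a subset, assumed additive over its singletons."""
    return sum(a[x] for x in subset)

def relative_base_rate(xi, xj, a):
    """a(xi | xj) = a(xi ∩ xj) / a(xj), following Eq. (10)."""
    return base_rate(xi & xj, a) / base_rate(xj, a)

# Hypothetical ternary frame with uniform base rates
frame = {"x1", "x2", "x3"}
a = {x: 1.0 / 3.0 for x in frame}
R = reduced_powerset(frame)
kappa = len(R)                      # expected to equal 2**k - 2
rel = relative_base_rate(frozenset({"x1"}), frozenset({"x1", "x2"}), a)
```

For a frame of cardinality k = 3 the reduced powerset has 2^3 - 2 = 6 elements, and with uniform base rates the singleton {x1} has relative base rate 1/2 with respect to {x1, x2}.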


General opinions are also called hyper opinions. A
multinomial opinion is when belief mass only applies to
singleton statements in the frame. A binomial opinion is when
the frame is binary. A dogmatic opinion is an opinion without
uncertainty, i.e. where u = 0. A vacuous opinion is an opinion
that only contains uncertainty, i.e. where u = 1. Likewise, we
can make the case that confidence in the opinion is biased by a
subjective opinion of the source self-confidence. Thus, self-
confidence is $SC_X = 1$ owing to rank-order decision-making on
a subset of the world, and the lack of self-confidence is $SC_X = 0$;
where:

$$ SC_X(\omega^A_X) = \vec{a}_X [1 - u_X] \qquad (11) $$

and

$$ \omega^A_X \Leftarrow SC_X(A) \bullet \vec{b}_X \qquad (12) $$

Equivalent probabilistic representations of opinions, e.g. as a
Beta pdf (probability density function) in case of binomial
opinions, as a Dirichlet pdf in case of multinomial opinions, or
as a hyper Dirichlet pdf in case of hyper opinions offer an
alternative interpretation of subjective opinions in terms of
traditional statistics [6].
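
For the multinomial case, where belief mass applies only to singletons, the relative base rate in Eq. (10) is 1 for matching singletons and 0 otherwise, so the probability projection reduces to E(x) = b(x) + a(x) u. The Python sketch below illustrates this special case; the class name `MultinomialOpinion` and the example values are hypothetical and not part of the paper.

```python
from dataclasses import dataclass

@dataclass
class MultinomialOpinion:
    belief: dict        # singleton -> belief mass b(x)
    uncertainty: float  # uncertainty mass u
    base_rate: dict     # singleton -> base rate a(x)

    def is_valid(self, tol=1e-9):
        """Check belief additivity (Eq. 7) and base rate additivity (Eq. 8)."""
        ok_belief = abs(self.uncertainty + sum(self.belief.values()) - 1.0) < tol
        ok_base = abs(sum(self.base_rate.values()) - 1.0) < tol
        return ok_belief and ok_base

    def projection(self):
        """Probability projection for singletons: E(x) = b(x) + a(x) * u."""
        return {x: self.belief.get(x, 0.0) + self.base_rate[x] * self.uncertainty
                for x in self.base_rate}

# Hypothetical binary frame for a network-security decision
opinion = MultinomialOpinion(
    belief={"attack": 0.6, "benign": 0.1},
    uncertainty=0.3,
    base_rate={"attack": 0.5, "benign": 0.5},
)
projected = opinion.projection()

# A vacuous opinion (u = 1) projects onto the base rates themselves
vacuous = MultinomialOpinion({"attack": 0.0, "benign": 0.0}, 1.0,
                             {"attack": 0.5, "benign": 0.5})
```

In this example the uncertainty mass of 0.3 is shared according to the base rates, giving projected probabilities of 0.75 and 0.25.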


Cumulative Fusion:

The cumulative fusion rule is equivalent to a posteriori
updating of Dirichlet distributions. Its derivation is based on
the bijective mapping between the belief and evidence
notations described in [6].


The symbol "◇" denotes the cumulative fusion of two
observers A and B into a single imaginary observer A◇B.


Let $\omega^A_X$ and $\omega^B_X$ be opinions respectively held by agents A and B
over the same frame X of cardinality k with reduced powerset
R(X) of cardinality κ. Let $\omega^{A \diamond B}_X$ be the opinion where:


CASE I: For $u^A \neq 0 \vee u^B \neq 0$ (with Confidence)

$$ b^{A \diamond B}(x_i) = \frac{b^A(x_i) \, u^B + b^B(x_i) \, u^A}{u^A + u^B - u^A u^B}, \qquad u^{A \diamond B} = \frac{u^A u^B}{u^A + u^B - u^A u^B} \qquad (13) $$

CASE II: For $u^A = 0 \wedge u^B = 0$ (without Confidence)

$$ b^{A \diamond B}(x_i) = \gamma^A b^A(x_i) + \gamma^B b^B(x_i), \qquad u^{A \diamond B} = 0 \qquad (14) $$

where:

$$ \gamma^A = \lim_{u^A \to 0, \, u^B \to 0} \frac{u^B}{u^A + u^B}, \qquad \gamma^B = \lim_{u^A \to 0, \, u^B \to 0} \frac{u^A}{u^A + u^B} $$


Note: the case without confidence averages the results from
self-confidence reports, which effectively weights both the
same. Confidence allows the user to weight the self-
confidence of the reports based on the Brunswikian
uncertainty about the world knowledge.


Then $\omega^{A \diamond B}_X$ is the cumulatively fused opinion of $\omega^A_X$ and $\omega^B_X$,
representing the combination of independent opinions of A and
B. By using the symbol '⊕' to designate this belief operator,
cumulative fusion is expressed as:

Cumulative Belief Fusion: $\omega^{A \diamond B}_X = \omega^A_X \oplus \omega^B_X$ (15)

The cumulative fusion operator is commutative, associative
and non-idempotent. In Eq.(15), the associativity depends on
the preservation of relative weights of intermediate results
through the weight variable γ, in which case the cumulative
rule is equivalent to the weighted average of probabilities.
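
The two cases of the cumulative fusion rule can be sketched in Python as follows. The function name, the dictionary representation of belief vectors, and the default weight `gamma_a = 0.5` for the dogmatic case (equal weighting, i.e. plain averaging) are assumptions made for illustration only.

```python
def cumulative_fusion(b_a, u_a, b_b, u_b, gamma_a=0.5):
    """Cumulative fusion of two opinions over the same frame.

    b_a, b_b : dict mapping each statement to its belief mass
    u_a, u_b : uncertainty masses of the two opinions
    gamma_a  : relative weight of A, used only when both opinions are dogmatic
    Returns (fused_belief, fused_uncertainty).
    """
    keys = set(b_a) | set(b_b)
    if u_a != 0.0 or u_b != 0.0:
        # Case I: at least one opinion carries uncertainty
        denom = u_a + u_b - u_a * u_b
        belief = {x: (b_a.get(x, 0.0) * u_b + b_b.get(x, 0.0) * u_a) / denom
                  for x in keys}
        return belief, (u_a * u_b) / denom
    # Case II: both opinions are dogmatic, so fall back to a weighted average
    gamma_b = 1.0 - gamma_a
    belief = {x: gamma_a * b_a.get(x, 0.0) + gamma_b * b_b.get(x, 0.0)
              for x in keys}
    return belief, 0.0

# Hypothetical opinions: A is fairly certain (u = 0.1), B is uncertain (u = 0.5)
fused_b, fused_u = cumulative_fusion({"x": 0.8, "y": 0.1}, 0.1,
                                     {"x": 0.2, "y": 0.3}, 0.5)
```

Because A carries less uncertainty than B, the fused belief in x stays close to A's value, and the fused uncertainty is smaller than that of either input opinion.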


V. RESULTS 


Assume we have two agent opinion makers $\omega^A$ and $\omega^B$, who
each make a decision for network security [50]. Let $\omega^A$ be a
machine Algorithm and let $\omega^B$ come from a human Being.
After reporting their opinion, $\omega^B$ is asked for their self-
confidence. The result modifies their belief $\omega^B_{SC}$ such that the
cumulative belief fusion product is a weighted function of
their self-confidence (source) over their confidence (data).
Figure 5 provides a perspective of the analysis.
[Figure 5 labels: SPRT, Detection Likelihood, Decision, Self-Confidence, Subjective Logic, CM Fusion, Decision (perceived) Represents Reality.]

Figure 5 — Analysis With and Without Self Confidence. 


For many situations, a machine can process large amounts of 
data, while a human agent can only comprehend a subset of 
the data. Thus, a machine processes the data as outputs to a 
user. The interaction with the user is continually updated and 
a decision from the user is required. For situations in which 
the user has more time (forensics), then his/her self-confidence 
in the data would be high. For quick decisions, an observe- 
orient-decide-act (OODA) decision might be required [51] 
which reduces self-confidence. We seek methods of the latter 
as uncertainty is higher in rapid decision making which is a 
subset of problems in the Dynamic Data-Driven Application 
Systems (DDDAS) paradigm [52, 53]. 


For the analysis, we have two opinion makers (machine and 
man). Using signal detection theory, their individual measures 
of analysis provide a likelihood function. We then fuse the 
results with confusion matrix fusion [54] as a method of 
combination using Bayesian, Dempster-Shafer, or DSmT 
results [55]. We utilize two cases in which there is a high and 
low-confident observer (Case 1) and then the situation in 
which both have comparable analysis (Case 2). With two 
highly self-confident observers (Case 2), the results are similar 
to one of the observers which could be used for opinion 
validity. However, the user could be looking at the results and 
further analyzing context to provide a more appropriate 
analysis of their decision (e.g., based on culture, data 
completeness, etc.). Using subjective logic, the human being
could modify their opinion, $\omega^B_{SC}$, which results in a larger
value (e.g., know something) or lower value (e.g., recognize
limitation of analysis).


We assume that if the user provides no assessment of self- 
confidence, we provide equal weight to the results (average 
fusion). On the other hand, if a machine provides a measure 
of confidence, it could be derived from the dynamic-data, 
which we don’t simulate here. 


Example (High self-confidence with low self-confidence) 
Assume that we have a highly self-confident opinion maker,
$\omega^A$, that includes many sources and reliable analysis. On the
other hand, we have a low-confident opinion maker, $\omega^B$, who
is making a decision. When making their decision, $\omega^B$ is
guessing or almost at chance, assuming that context provides
pragmatic understanding of the world events.


In Figure 6, there are two opinion makers: the red curve of a
human agent suggesting that the result is “improbable,” while
the more self-confident opinion maker, in blue, reports “probable”. The
fused result using self-confidence, shown in green, better
reflects the true state than the average fusion of the opinion
makers, shown in magenta. The key issue is that self-
confidence can help weight evidence.


Figure 6 — DS With (Fused) and With-out (Ave) Self-Confidence.


Exploring DSmT [44], using the proportional conflict
redistribution rule (PCR6¹), we also see in Figure 7 an
improvement in the belief confidence when self-confidence is
accounted for.

[Plot: Estimation of Credibility Fused - PCR5. Horizontal axis: scan number. Legend: Ground truth, PCR5 Probable, PCR5 Improbable, PCR5 Fused, PCR5 Ave.]


Figure 7 — PCR6 With (Fused) and With-out (Ave) Self-Confidence.


VI. CONCLUSIONS


In this paper, we assessed self-confidence as a criterion in the
URREF. Self-confidence is typically associated with a source
and relates a subjective quality on the rendering of their beliefs
over data. For stochastic observations, we use the SPRT in a
self-confidence analysis. However, to handle the case of partial
information, we use subjective logic for decision-makers. We
demonstrated that PCR6 is superior to DS for decision making in
a scenario in which a highly self-confident observer's opinion is
fused with that of a low self-confident observer. Ultimately, it is the
user's trust in the data they have available and their opinions that
bear on self-confidence, whereas a machine only reports confidence.


Further directions include using the analysis with real 
operators doing intelligence analysis over data and associating 
semantic boundaries to their subjective decision-making. 





¹ In the scenario, we used sequential fusion of two sources and because of this,
PCR5=PCR6, i.e. when combining 2 sources only, PCR5 coincides with
PCR6.


Evaluations of Evidence Combination Rules in Terms
of Statistical Sensitivity and Divergence 


Deqiang Han 
Jean Dezert 
Yi Yang 


Abstract—The theory of belief functions is one of the most
important tools in information fusion and uncertainty reasoning.
Dempster’s rule of combination and its related modified versions
are used to combine independent pieces of evidence. However,
until now there are still no solid evaluation criteria or methods
for these combination rules. In this paper, we regard evidence
combination as an estimation procedure and then propose
a set of criteria to evaluate the sensitivity and divergence of
different combination rules by drawing on the mean
square error (MSE), the bias and the variance. Numerical
examples and simulations are used to illustrate our proposed
evaluation criteria. Related analyses are also provided.


Keywords—belief functions; evidence combination; evaluation; 
sensitivity, divergence. 


I. INTRODUCTION 


The theory of belief functions, also called Dempster-Shafer
evidence theory (DST) [1], is one of the most important
theories and methods in information fusion and uncertainty
reasoning. It can distinguish ‘unknown’ and ‘imprecision’ and
proposes a way to fuse or combine different pieces of evidence
by using the commutative and associative Dempster’s rule of
combination.


Dempster’s rule of combination can bring counter-intuitive
combination results in some cases [2], [3], so several improved
and modified alternative evidence combination rules have
emerged, where counter-intuitive behaviors are imputed
to the combination rule itself, especially the way it
deals with the conflicting mass assignments. The representative
works include Yager’s rule [4], Florea’s robust combination
rule (RCR) [5], disjunctive rule [6], Dubois and Prade’s rule
[7], proportional conflict redistribution rule (PCR) [8], and
mean rule [9], etc.


As aforementioned, several combination rules are available,
including Dempster’s rule and its alternatives. How, then,
should they be evaluated? This is crucial for the practical use of
the combination rules. The qualitative criterion is that the
combination results should be intuitive and rational [10]. Up to
now, there are still no solid performance evaluation approaches
for combination rules, especially for establishing quantitative
criteria. In this paper, we propose to interpret evidence
combination as a procedure of estimation [11]; therefore, a
combination rule is regarded as an estimator. So, we define
some statistical criteria on sensitivity and divergence for the
different combination rules by drawing on the idea of
Mean Square Error (MSE) and its decomposition in estimation.
By adding small errors to the original pieces of evidence
(i.e., the “input” of the “estimator”), we check the mean
square error, the variance, and the bias of the combination
result (“output” of the estimator) caused by adding some
noise to describe the sensitivity and divergence of the given
combination rule. Distance of evidence [12] is used in our
work to define the variance, the bias and other related criteria.
Simulation results are provided to illustrate our proposed
evaluation approaches. Dempster’s rule and major available
alternative rules are evaluated and analyzed using the new
evaluation approaches.


Originally published as Han D., Dezert J., Yang Y., Evaluations on Evidence
Combination Rules in Terms of Statistical Sensitivity and Divergence, in Proc. of
Fusion 2014 Int Conf on Information Fusion, Salamanca, Spain, July 7-10, 2014,
and reprinted with permission.


II. BASICS OF DST 


Dempster-Shafer evidence theory (DST) [1] was developed by
Shafer in 1976 based on previous works of Dempster. In evidence
theory, the elements of the frame of discernment (FOD) $\Theta$
are mutually exclusive and exhaustive. Define
$m : 2^\Theta \to [0,1]$ as a basic belief assignment (BBA, also
called mass function) satisfying:

$$\sum_{A \in 2^\Theta} m(A) = 1, \quad m(\emptyset) = 0 \tag{1}$$

If $m(A) > 0$, $A$ is called a focal element. In DST, two reliable
independent bodies of evidence (BOEs) $m_1(\cdot)$ and $m_2(\cdot)$
are combined using Dempster’s rule of combination as follows.
$\forall A \in 2^\Theta$:

$$m(A) = \begin{cases} 0, & A = \emptyset \\ \dfrac{1}{1-K} \sum\limits_{A_i \cap B_j = A} m_1(A_i)\, m_2(B_j), & A \neq \emptyset \end{cases} \tag{2}$$

where

$$K = \sum_{A_i \cap B_j = \emptyset} m_1(A_i)\, m_2(B_j) \tag{3}$$

represents the total conflicting (contradictory) mass assignments.
From Eq. (2), it can be verified that Dempster’s rule is both
commutative and associative. In Dempster’s rule of combination,
the conflicting mass assignments are discarded through a classical
normalization step.

As first pointed out by Zadeh [2], Dempster’s rule has been
criticized for its counter-intuitive behaviors¹. The validity of
DST has also been disputed [3]. Several alternative evidence
combination rules have emerged, aiming to suppress the
counter-intuitive behaviors of the classical Dempster’s rule.
See [8] for details.
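
Dempster’s rule in Eqs. (2) and (3) can be sketched in a few lines of Python. This is only an illustrative sketch: representing a BBA as a dictionary from `frozenset` to mass, the function name `dempster`, and the element labels `t1`, `t2`, `t3` are assumptions made here, not part of the original paper.

```python
from itertools import product

def dempster(m1, m2):
    """Combine two BBAs (dict: frozenset -> mass) with Dempster's rule.

    Returns the fused BBA and the conflict coefficient K of Eq. (3).
    """
    conj = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:                      # non-empty intersection: conjunctive mass
            conj[inter] = conj.get(inter, 0.0) + wa * wb
        else:                          # empty intersection: conflicting mass
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Normalization step of Eq. (2): the conflicting mass is discarded.
    fused = {a: w / (1.0 - conflict) for a, w in conj.items()}
    return fused, conflict

# Hypothetical labels for a three-element frame of discernment.
THETA = frozenset({"t1", "t2", "t3"})
m = {frozenset({"t1"}): 0.6, frozenset({"t2"}): 0.3, THETA: 0.1}
fused, K = dempster(m, m)
print(K, fused)
```

Combining this BBA with itself gives K = 0.36 and, after normalization, masses 0.75, 0.234375 and 0.015625 on {t1}, {t2} and the total set.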


To measure the dissimilarity between different BBAs, a distance
of evidence can be used. Jousselme’s distance [13] is one of the
most commonly used distances of evidence. It is defined as

$$d_J(m_1, m_2) = \sqrt{\frac{1}{2}\,(m_1 - m_2)^T\, \mathrm{Jac}\, (m_1 - m_2)} \tag{4}$$

where the element $J_{ij} = \mathrm{Jac}(A_i, B_j)$ of Jaccard’s
weighting matrix $\mathrm{Jac}$ is defined as

$$\mathrm{Jac}(A_i, B_j) = \frac{|A_i \cap B_j|}{|A_i \cup B_j|} \tag{5}$$

There are also other types of distance of evidence [12], [14].
We use Jousselme’s distance in this paper because it has been
proved to be a strict distance metric [15].
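
As a hedged illustration of Eqs. (4) and (5), the sketch below computes Jousselme’s distance for BBAs stored as dictionaries from `frozenset` to mass (an assumed representation, with hypothetical labels `t1`, `t2`). It only loops over the focal elements of the two BBAs, which is sufficient because all other components of $m_1 - m_2$ are zero.

```python
import math

def jousselme(m1, m2):
    """Jousselme's distance between two BBAs (dict: frozenset -> mass).

    The BBAs are assumed to carry no mass on the empty set.
    """
    focal = list(set(m1) | set(m2))
    diff = [m1.get(a, 0.0) - m2.get(a, 0.0) for a in focal]
    quad = 0.0
    for da, a in zip(diff, focal):
        for db, b in zip(diff, focal):
            # Jaccard weight |A ∩ B| / |A ∪ B| of Eq. (5)
            quad += da * db * len(a & b) / len(a | b)
    # max(...) guards against tiny negative values caused by rounding
    return math.sqrt(max(0.5 * quad, 0.0))

T1, T2 = frozenset({"t1"}), frozenset({"t2"})
THETA = T1 | T2
d_disjoint = jousselme({T1: 1.0}, {T2: 1.0})    # two contradictory certainties
d_vacuous = jousselme({T1: 1.0}, {THETA: 1.0})  # certainty vs. total ignorance
print(d_disjoint, d_vacuous)
```

Two categorical BBAs on disjoint singletons are at the maximal distance 1, while a categorical BBA and the vacuous BBA on a two-element frame are at distance $\sqrt{1/2}$.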


III. SOME MAJOR ALTERNATIVE COMBINATION RULES 


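In this section, some major combination rules in evidence
theory other than Dempster’s rule are briefly introduced. For
all $A \in 2^\Theta$:


1) Yager’s rule [4]:

$$\begin{cases} m(\emptyset) = 0 \\ m_{Yager}(A) = \sum\limits_{B_i \cap C_j = A \neq \emptyset} m_1(B_i)\, m_2(C_j) \\ m(\Theta) = m_1(\Theta)\, m_2(\Theta) + \sum\limits_{B_i \cap C_j = \emptyset} m_1(B_i)\, m_2(C_j) \end{cases} \tag{6}$$

In Yager’s rule, the conflicting mass assignments are assigned to
the total set $\Theta$ of the FOD.


2) Disjunctive rule [6]:

$$\begin{cases} m(\emptyset) = 0 \\ m_{Dis}(A) = \sum\limits_{B_i \cup C_j = A} m_1(B_i)\, m_2(C_j) \end{cases} \tag{7}$$

This rule reflects the disjunctive consensus.


3) Dubois & Prade’s rule (D&P rule) [7]:

$$\begin{cases} m(\emptyset) = 0 \\ m_{DP}(A) = \sum\limits_{B_i \cap C_j = A \neq \emptyset} m_1(B_i)\, m_2(C_j) + \sum\limits_{B_i \cap C_j = \emptyset,\ B_i \cup C_j = A} m_1(B_i)\, m_2(C_j) \end{cases} \tag{8}$$

This rule admits that the two sources are reliable when they
are not in conflict, but that only one of them is right when a
conflict occurs.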


¹ According to the proponents of Dempster’s rule, the
counter-intuitive behavior is imputed to the sensors, the data or
the BOEs obtained from different sources, but not to Dempster’s
rule itself.


4) Robust Combination Rule (RCR, or Florea’s rule) [5]:

$$m_{RCR}(A) = \alpha(K)\, m_{Dis}(A) + \beta(K)\, m_{Conj}(A) \tag{9}$$

where $m_{Dis}$ is the BBA obtained using the disjunctive rule,
$m_{Conj}$ is the BBA obtained using the conjunctive rule, and
$\alpha(K)$, $\beta(K)$ are weights, which should satisfy

$$\alpha(K) + (1 - K)\,\beta(K) = 1 \tag{10}$$

where $K$ is the conflict coefficient defined in Eq. (3). The
robust combination rule can be considered as a weighted sum
of the BBAs obtained using the disjunctive rule and the
conjunctive rule, respectively.


5) PCR5 [8]: Proportional Conflict Redistribution rule 5
(PCR5) redistributes each partial conflicting mass to the
elements involved in that partial conflict, considering the
canonical form of the partial conflict. PCR5 is the most
mathematically exact redistribution of conflicting mass to
non-empty sets following the logic of the conjunctive rule.

$m_{PCR5}(\emptyset) = 0$ and $\forall X \in 2^\Theta \setminus \{\emptyset\}$

$$m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) + \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \tag{11}$$

In fact, another rule, PCR6, coincides with PCR5 when combining
two sources but differs from it when combining more than two
sources altogether. PCR6 is considered more efficient than PCR5
because it is compatible with the classical frequentist
probability estimate [16].


6) Mean rule [9]:

$$m_{Mean}(A) = \frac{1}{n} \sum_{i=1}^{n} m_i(A) \tag{12}$$

This rule takes the average of the BBAs to be combined.
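
Among the rules listed above, PCR5 has the least obvious formula, so a small sketch for the two-source case of Eq. (11) may help. The dictionary representation of a BBA, the function name `pcr5` and the labels `a`, `b` are assumptions made for this illustration.

```python
from itertools import product

def pcr5(m1, m2):
    """PCR5 for two BBAs given as dict: frozenset -> mass."""
    out = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        inter = x1 & x2
        if inter:
            # Conjunctive part: first sum of Eq. (11).
            out[inter] = out.get(inter, 0.0) + w1 * w2
        elif w1 + w2 > 0.0:
            # Partial conflict w1*w2 is sent back to x1 and x2
            # proportionally to the masses w1 and w2 involved.
            out[x1] = out.get(x1, 0.0) + w1 * w1 * w2 / (w1 + w2)
            out[x2] = out.get(x2, 0.0) + w2 * w2 * w1 / (w1 + w2)
    return out

A, B = frozenset({"a"}), frozenset({"b"})
m1 = {A: 0.6, B: 0.4}
m2 = {A: 0.2, B: 0.8}
fused = pcr5(m1, m2)
print(fused)
```

In this example no mass is lost or moved to the total set: the redistributed conflict stays on the two singletons that generated it, and the fused masses still sum to one.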


For the practical use of the different combination rules,
evaluation criteria are required. In the next section, the
available evaluation criteria or properties of evidence
combination rules are briefly introduced.


IV. PROPERTIES OF COMBINATION RULES AS 
QUALITATIVE CRITERIA 


1) Commutativity [17]: The combination of two BBAs $m_1$
and $m_2$ using some rule $R$ does not depend on the order of
the two BBAs, i.e.,

$$R(m_1, m_2) = R(m_2, m_1) \tag{13}$$

All the combination rules introduced in Section III are
commutative.

2) Associativity [17]: The combination result of multiple
BBAs does not depend on the order in which the BBAs are
combined. For example, when there are 3 BBAs,

$$R(R(m_1, m_2), m_3) = R(m_1, R(m_2, m_3)) \tag{14}$$

Dempster’s rule and the disjunctive rule are associative. The
other rules introduced in Section III are not associative.
Associativity is important because it facilitates the
implementation of distributed information fusion systems. It
should be noted, however, that associativity does not
necessarily yield an efficient rule in terms of the quality of
the fusion result: non-associative rules are able to provide
better performance in general than associative rules [16].


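3) Neutral impact of the vacuous belief [17]: The combination
rule preserves the neutral impact of the vacuous BBA,
i.e., when $m_2$ is the vacuous BBA $m(\Theta) = 1$,

$$R(m_1, m_2) = m_1 \tag{15}$$

All the rules introduced in Section III satisfy this property,
except the mean rule and the disjunctive rule (for the latter,
$B_i \cup \Theta = \Theta$, so all the mass is transferred to $\Theta$).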
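
The neutrality property of Eq. (15) is easy to check numerically. The following sketch, under the assumption that BBAs are dictionaries from `frozenset` to mass and with hypothetical labels, compares Dempster’s rule and the mean rule when the second source is vacuous.

```python
from itertools import product

def dempster(m1, m2):
    """Dempster's rule for BBAs given as dict: frozenset -> mass."""
    conj, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        if a & b:
            conj[a & b] = conj.get(a & b, 0.0) + wa * wb
        else:
            conflict += wa * wb
    return {a: w / (1.0 - conflict) for a, w in conj.items()}

def mean_rule(m1, m2):
    """Mean rule of Eq. (12) for two sources."""
    keys = set(m1) | set(m2)
    return {a: 0.5 * (m1.get(a, 0.0) + m2.get(a, 0.0)) for a in keys}

def same_bba(ma, mb, tol=1e-12):
    """True if the two BBAs agree on every focal element up to tol."""
    keys = set(ma) | set(mb)
    return all(abs(ma.get(a, 0.0) - mb.get(a, 0.0)) <= tol for a in keys)

THETA = frozenset({"t1", "t2", "t3"})
m1 = {frozenset({"t1"}): 0.6, frozenset({"t2"}): 0.3, THETA: 0.1}
vacuous = {THETA: 1.0}

dempster_neutral = same_bba(dempster(m1, vacuous), m1)
mean_neutral = same_bba(mean_rule(m1, vacuous), m1)
print(dempster_neutral, mean_neutral)
```

Dempster’s rule returns $m_1$ unchanged, whereas the mean rule pulls half of the mass toward the total set.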


These criteria are qualitative and correspond to good
(interesting) properties that a rule could satisfy. It should be
noted, however, that an efficient fusion rule does not
necessarily have to satisfy all of these “expected good”
properties. These properties are therefore not sufficient for
evaluating combination rules. In this paper, we propose some
quantitative evaluation criteria for combination rules.


V. STATISTICAL SENSITIVITY AND DIVERGENCE OF 
COMBINATION RULES 


Here, we develop a group of criteria for combination rules 
in terms of sensitivity and divergence. The idea of Mean 
Square Error (MSE) and its decomposition are used as a basic 
framework for such a development. 


A. Mean Square Error and its decomposition 


For an estimate $\hat{x}$ of the scalar estimand $x$, the MSE is
defined as

$$MSE(\hat{x}) = E[(\hat{x} - x)^2] \tag{16}$$

The MSE can be decomposed as

$$\begin{aligned} MSE(\hat{x}) &= E[(\hat{x} - E(\hat{x}))^2] + (E(\hat{x}) - x)^2 \\ &= Var(\hat{x}) + (Bias(\hat{x}, x))^2 \end{aligned} \tag{17}$$

The MSE is equal to the sum of the variance and the squared
bias of the estimator (or of the estimates). The variance can
represent the divergence of the estimation results. The bias can
represent the sensitivity of the estimator.
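
For readers who want the intermediate step, the decomposition follows from adding and subtracting $E(\hat{x})$ inside the square. The derivation below is the standard textbook argument, stated here only as a reminder:

```latex
\begin{aligned}
E[(\hat{x} - x)^2]
  &= E\big[(\hat{x} - E(\hat{x}) + E(\hat{x}) - x)^2\big] \\
  &= E\big[(\hat{x} - E(\hat{x}))^2\big]
     + 2\,(E(\hat{x}) - x)\,E\big[\hat{x} - E(\hat{x})\big]
     + (E(\hat{x}) - x)^2 \\
  &= Var(\hat{x}) + (Bias(\hat{x}, x))^2 ,
\end{aligned}
```

since $E[\hat{x} - E(\hat{x})] = 0$ makes the cross term vanish and $E(\hat{x}) - x$ is a constant.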


B. Criteria for statistical sensitivity and divergence 


If we consider the procedure of evidence combination with
a given rule as an estimator (as illustrated in Fig. 1), then we
can consider the combination results as the estimates.


We can therefore borrow the MSE and its decomposition to
measure the error, the variance, and the bias of the combination
results produced by a given combination rule. Here we design
criteria related to the sensitivity and divergence of
combination rules. We use the change in the combination results
after adding small noise to the original BBA to reflect the
sensitivity and divergence of a combination rule. If, under a
given small noise, a combination rule yields a smaller variance
and a smaller bias, then the rule is less divergent and less
sensitive; on this basis, the sensitivity and divergence of
combination rules can be evaluated. The definitions of MSE,
variance and bias for combination rules, and the evaluation
procedure, are as follows.


[Figure: block diagram with two parallel rows. Estimation: Measurements and Priori Information → Estimator → Estimate. Evidence combination: Evidences and Rule Selection → Combination → New Evidence.]

Fig. 1. Evidence combination and Estimation.


Step 1: Randomly generate a BBA $m$. Add random noise to
$m$ $N$ times. Each time, the noise is $\epsilon_i$ (a small
value), where $i = 1, \ldots, N$. The noise sequence is denoted
by $\epsilon = [\epsilon_1, \epsilon_2, \ldots, \epsilon_N]$. Each
$\epsilon_i$ is a small real number (negative or positive) close
to zero. We thus obtain a sequence of noised BBAs

$$m' = [m'_1, m'_2, \ldots, m'_N] \tag{18}$$

It should be noted that all the noised BBAs are normalized.


Step 2: Generate the sequence of original combination results
with a combination rule $R$:

$$m_c = [m_c^1, m_c^2, \ldots, m_c^N] = [R(m, m), R(m, m), \ldots, R(m, m)] \tag{19}$$

The length of $m_c$ is $N$.


Step 3: Generate the sequence of combination results obtained by
combining the BBAs with noise and the original BBA using the
rule $R$:

$$m_{cn} = [m_{cn}^1, m_{cn}^2, \ldots, m_{cn}^N] = [R(m'_1, m), R(m'_2, m), \ldots, R(m'_N, m)] \tag{20}$$

Step 4: Calculate the MSE of $m_{cn}$ as

$$MSE_{BBA}(m_{cn}) = \frac{1}{N} \sum_{i=1}^{N} \left[d_J(m_c^i, m_{cn}^i)\right]^2 = \frac{1}{N} \sum_{i=1}^{N} \left[d_J(R(m, m), R(m, m'_i))\right]^2 \tag{21}$$

where $d_J$ is Jousselme’s distance defined in Eq. (4). $MSE_{BBA}$
represents the error between the original combination results
and the results obtained using BBAs with noise.


We can also calculate the relative MSE by removing the
effect of the noise amplitude:

$$MSE'_{BBA}(m_{cn}) = \frac{MSE_{BBA}(m_{cn})}{\|\epsilon\|^2} \tag{22}$$

Step 5: Calculate the variance of $m_{cn}$ as

$$Var_{BBA}(m_{cn}) = \frac{1}{N} \sum_{i=1}^{N} \left[d_J(m_{cn}^i, \bar{m}_{cn})\right]^2 = \frac{1}{N} \sum_{i=1}^{N} \left[d_J\Big(R(m, m'_i), \frac{1}{N} \sum_{j=1}^{N} R(m, m'_j)\Big)\right]^2 \tag{23}$$

where $\bar{m}_{cn} = \frac{1}{N} \sum_{i=1}^{N} m_{cn}^i = \frac{1}{N} \sum_{j=1}^{N} R(m, m'_j)$.
$Var_{BBA}$ represents the fluctuations of the combination results
obtained using BBAs with noise.


Then, calculate the relative variance by removing the effect
of the noise amplitude:

$$Var'_{BBA}(m_{cn}) = \frac{Var_{BBA}(m_{cn})}{Var(\epsilon)} \tag{24}$$

The relative variance represents the degree of amplification or
reduction of the variance before and after combination.


Step 6: Calculate the bias of $m_{cn}$ as

$$Bias_{BBA}(m_{cn}) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left[d_J(m_c^i, \bar{m}_{cn})\right]^2} \tag{25}$$

$Bias_{BBA}$ represents the difference between the expectation of
the combination results obtained using BBAs with noise and
the original combination results.


Then, calculate the relative bias by removing the effect of
the noise amplitude:

$$Bias'_{BBA}(m_{cn}) = \frac{Bias_{BBA}(m_{cn})}{\|\epsilon\|} \tag{26}$$

Randomly regenerate a new original BBA $m$ $M$ times, each
time repeating Step 1 to Step 6. Based on the $M$ groups
of results, calculate the averaged $MSE'_{BBA}$, the averaged
$Var'_{BBA}$ and the averaged $Bias'_{BBA}$. These three indices
are called the statistical MSE, the statistical variance, and the
statistical bias of the combination rule $R$. We jointly use
these indices (quantitative criteria) to describe the statistical
sensitivity and divergence of a given combination rule $R$.
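
One evaluation cycle (Steps 2 to 6) can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: BBAs are dictionaries from `frozenset` to mass, the bias is computed as the Jousselme distance between the noiseless result and the mean noisy result (so that MSE = Var + Bias² holds), Var(ε) is the population variance, and the noised BBAs in the usage part are a hand-made perturbation rather than the output of Method I or Method II.

```python
import math
from itertools import product

def dempster(m1, m2):
    """Dempster's rule for BBAs given as dict: frozenset -> mass."""
    conj, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        if a & b:
            conj[a & b] = conj.get(a & b, 0.0) + wa * wb
        else:
            conflict += wa * wb
    return {a: w / (1.0 - conflict) for a, w in conj.items()}

def jousselme(m1, m2):
    """Jousselme's distance, restricted to the focal elements."""
    focal = list(set(m1) | set(m2))
    diff = [m1.get(a, 0.0) - m2.get(a, 0.0) for a in focal]
    quad = sum(da * db * len(a & b) / len(a | b)
               for da, a in zip(diff, focal) for db, b in zip(diff, focal))
    return math.sqrt(max(0.5 * quad, 0.0))

def mean_bba(bbas):
    """Element-wise average of a list of BBAs."""
    out = {}
    for bba in bbas:
        for a, w in bba.items():
            out[a] = out.get(a, 0.0) + w / len(bbas)
    return out

def evaluate(rule, m, noised, eps):
    """One evaluation cycle (Steps 2 to 6) for a given combination rule."""
    n = len(noised)
    ref = rule(m, m)                                        # Step 2
    res = [rule(m, mp) for mp in noised]                    # Step 3
    center = mean_bba(res)
    mse = sum(jousselme(ref, r) ** 2 for r in res) / n      # Step 4
    var = sum(jousselme(r, center) ** 2 for r in res) / n   # Step 5
    bias = jousselme(ref, center)                           # Step 6 (assumed form)
    norm2 = sum(e * e for e in eps)
    mean_eps = sum(eps) / n
    var_eps = sum((e - mean_eps) ** 2 for e in eps) / n     # population variance
    return {"mse": mse, "var": var, "bias": bias,
            "rel_mse": mse / norm2, "rel_var": var / var_eps,
            "rel_bias": bias / math.sqrt(norm2)}

T1, T2 = frozenset({"t1"}), frozenset({"t2"})
THETA = frozenset({"t1", "t2", "t3"})
m = {T1: 0.6, T2: 0.3, THETA: 0.1}
eps = [-0.02, -0.01, 0.01, 0.02]
# Hand-made perturbation for illustration only (neither Method I nor II).
noised = [{T1: 0.6 + e, T2: 0.3 - e, THETA: 0.1} for e in eps]
scores = evaluate(dempster, m, noised, eps)
print(scores)
```

Averaging the three relative indices over many randomly generated BBAs `m` would give the statistical indices described above; any other rule with the same two-argument signature can be passed in place of `dempster`.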


Relative MSE is a comprehensive index. A larger relative
MSE intuitively means larger sensitivity. However, relative
MSE alone is insufficient to evaluate a combination rule, so we
further use its decomposition (the relative variance and the
relative bias) for a deeper analysis.


High relative bias values represent high sensitivity, i.e., a
high degree of departure from the original result. The relative
bias reflects a combination rule’s capability to respond
sensitively to changes in the input evidence; it represents the
“agility” of a combination rule. Moderate relative bias values
are preferred, which means a balance or trade-off between
robustness and sensitivity.

Relative variance represents the degree of amplification or
reduction of the variance before and after combination. In the
evaluation procedure, the variance of the noise is the same for
all the combination rules (the same noise sequence is used for
the different rules). High relative variance values therefore
also represent high divergence among the combination results of
a given rule when noise is added. Small relative variance values
are preferred, as they represent high cohesion of a given
combination rule.


In this work, we propose a statistical evaluation approach
for evidence combination rules based on Monte-Carlo simulation.
To implement the statistical evaluation of a combination rule
according to the method introduced here, two problems must
first be resolved: how to add noise, and how to randomly
generate BBAs.


C. Method I for adding noise 


Method I for adding noise is designed to evaluate the effect
of a slight change in the mass value of an existing focal
element. Suppose that $m$ is a BBA defined on the FOD $\Theta$.
First, we find the primary focal element (the focal element
having the highest mass assignment)², i.e., the focal element
$A_i$ satisfying

$$i = \arg\max_{j,\, A_j \subseteq \Theta} m(A_j) \tag{27}$$

Second, add the noise $\epsilon$ to the mass assignment of the
primary focal element:

$$m'(A_i) = m(A_i) \cdot (1 + \epsilon) \tag{28}$$

Then, for the mass assignments of the other focal elements in
the original BBA,

$$m'(A_j) = m(A_j) - \frac{m(A_j)}{1 - m(A_i)} \cdot \epsilon \cdot m(A_i), \quad \forall j \neq i \tag{29}$$

$m'$ is the generated BBA with noise. It is easy to verify that

$$\sum_{B \subseteq \Theta} m'(B) = 1 \tag{30}$$

It can be seen that the change of mass assignment is the most
significant for the primary focal element. The change of the
mass assignment of the primary focal element is redistributed to
all the other focal elements in proportion to their mass
assignments. BBAs are generated according to Algorithm 1
below [12].


For Method I, some restrictions must be imposed on the values
of the original BBA and of the added noise to make sure that the
noised BBA $m'$ satisfies the definition of a BBA. The
restriction is shown in Eq. (31).

$$0 < (1 + \epsilon) \cdot \max(m(A)) < 1, \quad \forall A \in 2^\Theta \tag{31}$$
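
Method I, read as in Eqs. (27) to (31), can be sketched as below. The dictionary representation, the function name `add_noise_primary` and the labels are assumptions for illustration; ties between several primary focal elements are resolved arbitrarily by `max`, which the text does not specify.

```python
def add_noise_primary(m, eps):
    """Perturb the primary focal element of a BBA (dict: frozenset -> mass)."""
    if len(m) < 2:
        raise ValueError("need at least two focal elements to redistribute mass")
    primary = max(m, key=m.get)            # Eq. (27); ties broken arbitrarily
    top = m[primary]
    if not 0.0 < (1.0 + eps) * top < 1.0:  # restriction of Eq. (31)
        raise ValueError("noise violates the restriction on the primary mass")
    noisy = {}
    for a, w in m.items():
        if a == primary:
            noisy[a] = top * (1.0 + eps)                    # Eq. (28)
        else:
            noisy[a] = w - w / (1.0 - top) * eps * top      # Eq. (29)
    return noisy

T1, T2 = frozenset({"t1"}), frozenset({"t2"})
THETA = frozenset({"t1", "t2", "t3"})
m = {T1: 0.6, T2: 0.3, THETA: 0.1}
noisy = add_noise_primary(m, -0.1)
print(noisy)
```

With ε = −0.1 this yields masses 0.54, 0.345 and 0.115, which still sum to one. Note that the sketch follows the multiplicative form of Eq. (28); an additive perturbation $m(A_i) + \epsilon$ would give different values.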


² For example, when $\Theta = \{\theta_1, \theta_2\}$ and $m(\{\theta_1\}) = 0.8$,
$m(\{\theta_2\}) = 0.1$, $m(\Theta) = 0.1$, the primary focal element is
$\{\theta_1\}$. When $\Theta = \{\theta_1, \theta_2\}$ and $m(\{\theta_1\}) = 0.45$,
$m(\{\theta_2\}) = 0.45$, $m(\Theta) = 0.1$, the primary focal elements are
$\{\theta_1\}$ and $\{\theta_2\}$.

Algorithm 1. Random generation of BBA

Input: $\Theta$: frame of discernment;
       $N_{max}$: maximum number of focal elements
Output: $m$: BBA

Generate $P(\Theta)$, which is the power set of $\Theta$;
Generate a random permutation of $P(\Theta)$ → $R(\Theta)$;
Generate an integer between 1 and $N_{max}$ → $l$;
FOREACH of the first $l$ elements of $R(\Theta)$ do
    Generate a value within [0,1] → $m_i$, $i = 1, \ldots, l$;
END
Normalize the vector $m = [m_1, \ldots, m_l]$ → $m'$;
$m(A_i) = m'_i$;


In Eq. (31), $m$ is the original BBA.
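
Algorithm 1 translates almost line by line into Python. The sketch below is one possible reading, with two labeled assumptions: the empty set is left out of the candidate focal elements (a BBA must have $m(\emptyset) = 0$), and raw values are drawn from (0, 1] so that the normalization never divides by zero. The function name `random_bba` and the `rng` parameter are hypothetical.

```python
import random
from itertools import combinations

def random_bba(theta, n_max, rng=random):
    """Randomly generate a BBA (dict: frozenset -> mass) on the frame theta."""
    elems = sorted(theta)
    # Power set of theta without the empty set (assumption, see lead-in).
    subsets = [frozenset(c) for r in range(1, len(elems) + 1)
               for c in combinations(elems, r)]
    rng.shuffle(subsets)                              # random permutation R(theta)
    l = rng.randint(1, min(n_max, len(subsets)))      # number of focal elements
    raw = [1.0 - rng.random() for _ in range(l)]      # values in (0, 1]
    total = sum(raw)
    # Normalize and assign the masses to the first l subsets.
    return {a: w / total for a, w in zip(subsets[:l], raw)}

rng = random.Random(0)   # fixed seed only to make the example repeatable
bba = random_bba({"t1", "t2", "t3"}, 5, rng)
print(bba)
```

Passing a seeded `random.Random` instance keeps Monte-Carlo runs reproducible; omitting it falls back to the module-level generator.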


D. Method II for adding noise 


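Method II for adding noise is designed to evaluate the effect
of creating new focal elements. Suppose that $m$ is a BBA with
a special structure defined on the FOD $\Theta$: the focal
elements are some singletons $\{\theta_i\}$ and the total set
$\Theta$. First, pick a pair of singleton focal elements
$\{\theta_i\}$ and $\{\theta_j\}$.


Second, create a new focal element $\{\theta_i, \theta_j\}$ with
the mass value $\epsilon$, i.e., $m'(\{\theta_i, \theta_j\}) = \epsilon$.


Then, the mass values for the focal elements $\{\theta_i\}$ and
$\{\theta_j\}$ are regenerated as

$$\begin{cases} m'(\{\theta_i\}) = m(\{\theta_i\}) - \epsilon \cdot \dfrac{m(\{\theta_i\})}{m(\{\theta_i\}) + m(\{\theta_j\})} \\ m'(\{\theta_j\}) = m(\{\theta_j\}) - \epsilon \cdot \dfrac{m(\{\theta_j\})}{m(\{\theta_i\}) + m(\{\theta_j\})} \end{cases} \tag{32}$$

Obviously, one has $\sum_{B \subseteq \Theta} m'(B) = 1$.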
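
A minimal sketch of Method II is given below. It assumes that the mass ε of the new focal element is taken from the two chosen singletons in proportion to their masses, which is the reading that keeps the noised BBA normalized; the function name `add_noise_new_focal` and the labels are hypothetical.

```python
def add_noise_new_focal(m, ti, tj, eps):
    """Create the focal element {ti, tj} with mass eps in a BBA.

    m is a dict: frozenset -> mass whose focal elements include
    the singletons {ti} and {tj}.
    """
    si, sj = frozenset({ti}), frozenset({tj})
    pair = si | sj
    if si not in m or sj not in m:
        raise ValueError("both singletons must be focal elements")
    if pair in m:
        raise ValueError("the pair is already a focal element")
    total = m[si] + m[sj]
    if not 0.0 < eps < total:
        raise ValueError("eps must be positive and smaller than the pair mass")
    noisy = dict(m)
    noisy[pair] = eps
    # Each singleton gives up a share of eps proportional to its own mass.
    noisy[si] = m[si] - eps * m[si] / total
    noisy[sj] = m[sj] - eps * m[sj] / total
    return noisy

T1, T2 = frozenset({"t1"}), frozenset({"t2"})
THETA = frozenset({"t1", "t2", "t3"})
m = {T1: 0.6, T2: 0.3, THETA: 0.1}
noisy = add_noise_new_focal(m, "t1", "t2", 0.09)
print(noisy)
```

Here the new focal element {t1, t2} receives 0.09, taken as 0.06 from {t1} and 0.03 from {t2}, while the mass on the total set is untouched.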





The BBAs with special structure (with only some 
singletons and the total set focal elements) are generated 
according to Algorithm 2 below: 


Algorithm 2. Random generation of BBA

Input: $\Theta$: frame of discernment;
       $n$: cardinality of $\Theta$;
       $N_{max}$: maximum number of focal elements
Output: $m$: BBA

Generate $P(\Theta)$, which is the power set of $\Theta$;
Generate a random permutation of $P(\Theta)$ → $R(\Theta)$;
FOR $i = 1 : N_{max} - 1$
    Generate an integer $j$ between 1 and $n$;
    Generate a focal element $F_i$ : $\{\theta_j\}$;
END
Generate a focal element $F_{N_{max}}$ : $\Theta$.
FOR $i = 1 : N_{max}$
    Generate a value within [0,1] → $m_i$;
END
Normalize the vector $m = [m_1, \ldots, m_{N_{max}}]$ → $m'$;
$m(F_i) = m'_i$;
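
Algorithm 2 can be sketched in Python as follows. One assumption differs from a literal reading: the singletons are drawn without repetition (via `sample`), so that the BBA has exactly $N_{max} - 1$ distinct singleton focal elements plus the total set; this requires $N_{max} - 1 \le n$. The function name `random_singleton_bba` and the `rng` parameter are hypothetical.

```python
import random

def random_singleton_bba(theta, n_max, rng=random):
    """Random BBA whose focal elements are singletons plus the total set."""
    elems = sorted(theta)
    if not 1 <= n_max - 1 <= len(elems):
        raise ValueError("need 1 <= n_max - 1 <= cardinality of theta")
    # Distinct singletons (assumption: drawn without repetition).
    focal = [frozenset({t}) for t in rng.sample(elems, n_max - 1)]
    focal.append(frozenset(elems))                   # the total set
    raw = [1.0 - rng.random() for _ in focal]        # values in (0, 1]
    total = sum(raw)
    return {a: w / total for a, w in zip(focal, raw)}

rng = random.Random(1)   # fixed seed only to make the example repeatable
bba = random_singleton_bba({"t1", "t2", "t3"}, 3, rng)
print(bba)
```

With a frame of cardinality 3 and `n_max = 3`, the result has two singleton focal elements and the total set, the structure used in Simulation II.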


For Method II, some restrictions must be imposed on the values
of the original BBA and of the added noise to make sure that the
noised BBA $m'$ satisfies the definition of a BBA. According to
Eq. (32), the restriction is shown in Eq. (33). For every
available singleton focal element $\{\theta_i\}$ in the original
BBA,

$$0 < m(\{\theta_i\}) \cdot \left(1 - \frac{\epsilon}{\sum_{m(\{\theta_j\}) > 0} m(\{\theta_j\})}\right) < 1 \tag{33}$$

where $m$ is the original BBA.


E. A simple illustrative example 


Here an illustrative example of single cycle calculation of 
the evaluation indices is provided by using Method I for adding 
noise. By referring to this illustrative example, evaluations by 
using Method II are easy to implement. 


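A BBA $m$ defined on the FOD $\Theta = \{\theta_1, \theta_2, \theta_3\}$ is
$m(\{\theta_1\}) = 0.6$, $m(\{\theta_2\}) = 0.3$, $m(\{\theta_1, \theta_2, \theta_3\}) = 0.1$.


Suppose that the noise sequence is
$\epsilon = [-0.1, -0.05, -0.02, 0.02, 0.05, 0.1]$.


It can be seen that the restrictions in Eq. (31) are not
violated.


According to Step 1, we generate the sequence of six
noised BBAs $m' = [m'_1, m'_2, m'_3, m'_4, m'_5, m'_6]$ as follows:

$m'_1(\{\theta_1\}) = 0.5$, $m'_1(\{\theta_2\}) = 0.375$, $m'_1(\{\theta_1, \theta_2, \theta_3\}) = 0.125$;
$m'_2(\{\theta_1\}) = 0.55$, $m'_2(\{\theta_2\}) = 0.3375$, $m'_2(\{\theta_1, \theta_2, \theta_3\}) = 0.1125$;
$m'_3(\{\theta_1\}) = 0.58$, $m'_3(\{\theta_2\}) = 0.315$, $m'_3(\{\theta_1, \theta_2, \theta_3\}) = 0.105$;
$m'_4(\{\theta_1\}) = 0.62$, $m'_4(\{\theta_2\}) = 0.285$, $m'_4(\{\theta_1, \theta_2, \theta_3\}) = 0.095$;
$m'_5(\{\theta_1\}) = 0.65$, $m'_5(\{\theta_2\}) = 0.2625$, $m'_5(\{\theta_1, \theta_2, \theta_3\}) = 0.0875$;
$m'_6(\{\theta_1\}) = 0.7$, $m'_6(\{\theta_2\}) = 0.225$, $m'_6(\{\theta_1, \theta_2, \theta_3\}) = 0.075$.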

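Here we use Dempster’s rule of combination. Then, according
to Step 2, the original combination sequence
$m_c = [m_c^1, m_c^2, \ldots, m_c^6]$ is, $\forall i = 1, \ldots, 6$,

$m_c^i(\{\theta_1\}) = 0.75$, $m_c^i(\{\theta_2\}) = 0.2344$, $m_c^i(\{\theta_1, \theta_2, \theta_3\}) = 0.0156$.


Then, according to Step 3, the sequence of combination
results obtained by combining the BBAs with noise and the
original BBA, $m_{cn} = [m_{cn}^1, m_{cn}^2, \ldots, m_{cn}^6]$, is

$m_{cn}^1(\{\theta_1\}) = 0.6627$, $m_{cn}^1(\{\theta_2\}) = 0.3162$, $m_{cn}^1(\{\theta_1, \theta_2, \theta_3\}) = 0.0211$;
$m_{cn}^2(\{\theta_1\}) = 0.7076$, $m_{cn}^2(\{\theta_2\}) = 0.2741$, $m_{cn}^2(\{\theta_1, \theta_2, \theta_3\}) = 0.0183$;
$m_{cn}^3(\{\theta_1\}) = 0.7334$, $m_{cn}^3(\{\theta_2\}) = 0.2500$, $m_{cn}^3(\{\theta_1, \theta_2, \theta_3\}) = 0.0167$;
$m_{cn}^4(\{\theta_1\}) = 0.7662$, $m_{cn}^4(\{\theta_2\}) = 0.2192$, $m_{cn}^4(\{\theta_1, \theta_2, \theta_3\}) = 0.0146$;
$m_{cn}^5(\{\theta_1\}) = 0.7895$, $m_{cn}^5(\{\theta_2\}) = 0.1973$, $m_{cn}^5(\{\theta_1, \theta_2, \theta_3\}) = 0.0132$;
$m_{cn}^6(\{\theta_1\}) = 0.8257$, $m_{cn}^6(\{\theta_2\}) = 0.1634$, $m_{cn}^6(\{\theta_1, \theta_2, \theta_3\}) = 0.0109$.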

For the noise sequence,

$\|\epsilon\|^2 = 0.0258$
$Var(\epsilon) = 0.0043$

According to Step 4, the value of MSE is

$MSE_{BBA}(m_{cn}) = 0.00270$
$MSE'_{BBA}(m_{cn}) = 0.00270 / 0.0258 = 0.104752$

According to Step 5, the value of variance is

$Var_{BBA}(m_{cn}) = 0.002697$
$Var'_{BBA}(m_{cn}) = 0.002697 / 0.0043 = 0.6272$

Finally, according to Step 6, the value of bias is

$Bias_{BBA}(m_{cn}) = 0.002406$
$Bias'_{BBA}(m_{cn}) = 0.002406 / \sqrt{0.0258} = 0.014978$


The above illustrates a one-cycle procedure. The same steps can
be carried out with other combination rules. By randomly
generating BBAs and repeating all the steps, we obtain the
final statistical evaluation results.
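
The normalizing quantities of the noise sequence used in this example are simple to recompute. The short check below assumes that Var(ε) is the population variance (division by N), which is the reading that matches the value 0.0043; the variable names are illustrative.

```python
import math

eps = [-0.1, -0.05, -0.02, 0.02, 0.05, 0.1]

norm2 = sum(e * e for e in eps)                    # squared norm of the noise
mean_eps = sum(eps) / len(eps)
var_eps = sum((e - mean_eps) ** 2 for e in eps) / len(eps)  # population variance

# Relative indices recomputed from the rounded values quoted in the example.
rel_var = 0.002697 / var_eps
rel_bias = 0.002406 / math.sqrt(norm2)

print(round(norm2, 4), round(var_eps, 4), round(rel_var, 4), round(rel_bias, 6))
```

The relative variance and relative bias obtained this way agree with the quoted 0.6272 and 0.014978 up to the rounding of the inputs.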


VI. SIMULATIONS 
A. Simulation I: using Method I for adding noise 


In our simulations, the cardinality of the FOD is 3. In the
random generation of BBAs, the number of focal elements
has been set to 5. The length of the noise sequence is 50: the
noise value starts from -0.1 and increases in steps of 0.004 up
to 0.1, excluding the zero value, which corresponds to the
noiseless case. In each simulation cycle, seven combination
rules, namely Dempster’s rule and the alternatives introduced in
Section III, are used in turn. We have repeated the Monte Carlo
simulation for 100 runs. In the random generation of original
BBAs, the restrictions in Eq. (31) are not violated. The
statistical results are listed in Tables I-III. The ranks of the
relative MSE, relative variance and relative bias are obtained
in descending order.


It should be noted that when using RCR in our simulation,
the weights are generated as follows.

$$\begin{cases} \alpha(K) = \dfrac{K}{1 - K + K^2} \\ \beta(K) = \dfrac{1 - K}{1 - K + K^2} \end{cases}$$


TABLE I. COMPARISONS IN TERMS OF MSE

Combination Rules     MSE_BBA      Rank
Dempster’s rule       0.0010758    1
Yager’s rule          0.0005743    5
Disjunctive rule      0.0004298    7
D&P rule              0.0006247    4
RCR                   0.0007152    3
PCR5                  0.0010505    2
Mean rule             0.0005711    6


TABLE II. COMPARISONS IN TERMS OF VARIANCE

Combination Rules     Var_BBA      Rank
Dempster’s rule       1.7260       1
Yager’s rule          0.9568       6
Disjunctive rule      0.7469       7
D&P rule              1.0226       4
RCR                   1.2788       3
PCR5                  1.6801       2
Mean rule             1            5


TABLE III. COMPARISONS IN TERMS OF BIAS

Combination Rules     Bias_BBA                        Rank
Dempster’s rule       0.81132*10^(exponent lost)      1
Yager’s rule          0.59228*10^-7                   2
Disjunctive rule      0.41949*10^-7                   4
D&P rule              0.49075*10^-7                   3
RCR                   0.39592*10^-7                   5
PCR5                  0.38066*10^-7                   6
Mean rule             0                               7


As we can see in Tables I-III, Dempster’s rule is the most
sensitive to the mass change according to the criterion of
the relative bias, and it also has the highest degree of
divergence according to the criterion of relative variance. The
mean rule is the least sensitive to the mass change according to
the criterion of relative bias, and it is always a rule with
smaller divergence according to the criterion of the relative
variance. Yager’s rule is always more sensitive to the mass
change and is always not so divergent. PCR5 is not so sensitive
to the mass change according to the criterion of bias (rank 6),
and it is not so divergent according to the criterion of the
relative variance. The robust combination rule (RCR) and Dubois
& Prade’s rule (D&P rule) are always moderate with respect to
the mass change, both in terms of sensitivity and in terms of
divergence. So, PCR5 and RCR are more moderate rules; thus,
they are relatively good choices for practical use.


B. Simulation II: using Method II for adding noise


In our simulations, the cardinality of the FOD is 3. In the
generation of BBAs, the total set $\Theta$ is used as a focal
element and the number of singleton focal elements has been set
to 2. The length of the noise sequence is 50 (the noise value
starts at 0.002 and increases in steps of 0.002 up to 0.1).
In each simulation cycle, seven combination rules, namely
Dempster’s rule and the alternatives introduced in Section III,
are used in turn. We have repeated the Monte Carlo simulation
for 100 runs. In the random generation of original BBAs, the
restrictions in Eq. (33) are not violated. The statistical
results are listed in Tables IV-VI. The ranks of the relative
MSE, relative variance and relative bias are obtained in
descending order.


The derivation of weights of RCR has been done in the 
same manner as for the Simulation I. 


TABLE IV. COMPARISONS IN TERMS OF MSE

Combination Rules     MSE_BBA      Rank
Dempster’s rule       0.0050       3
Yager’s rule          0.0038       5
Disjunctive rule      0.0033       6
D&P rule              0.0014       7
RCR                   0.0078       1
PCR5                  0.0043       4
Mean rule             0.0056       2


TABLE V. COMPARISONS IN TERMS OF VARIANCE

Combination Rules     Var_BBA      Rank
Dempster’s rule       0.9218       3
Yager’s rule          0.7019       5
Disjunctive rule      0.5434       6
D&P rule              0.2714       7
RCR                   1.3528       1
PCR5                  0.8014       4
Mean rule             1            2


TABLE VI. COMPARISONS IN TERMS OF BIAS

Combination Rules     Bias_BBA     Rank
Dempster’s rule       0.0016       3
Yager’s rule          0.0012       5
Disjunctive rule      0.0011       6
D&P rule              0.0004       7
RCR                   0.0025       1
PCR5                  0.0013       4
Mean rule             0.0018       2


As we can see in Tables IV-VI, RCR is the most sensitive to
the change of focal elements according to the criterion of the
relative bias, and it also has the highest degree of divergence
according to the criterion of relative variance. Dubois &
Prade’s rule (D&P rule) is the least sensitive rule according
to the criterion of relative bias, and it is always a rule with
smaller divergence according to the criterion of the relative
variance. Yager’s rule is always insensitive and is always not
so divergent. The mean rule is sensitive to the change of focal
elements according to the criterion of bias (rank 2), and it is
divergent according to the criterion of the relative variance.
Dempster’s rule is not so sensitive to the change of focal
elements. PCR5 and Yager’s rule are always moderate with respect
to the change of focal elements, both in terms of sensitivity
and in terms of divergence.


According to the simulation results, the different methods of
adding noise affect the results of the comparative evaluations
differently. However, whichever method is adopted (keeping the
original core of the BBA, or modifying it slightly), PCR5
provides quite robust results for combining two BBAs and is
thus of practical interest from this standpoint.


VII. CONCLUSION 


In this paper we have proposed a group of statistical criteria
for evaluating the sensitivity of different combination rules
with respect to noise perturbations. The design is based
on classical measures of performance such as the MSE, the
variance, and the bias encountered in estimation theory. We do
not rank the rules according to their a priori “good expected”
properties. Moderate relative bias values are preferred, which
means a balance or trade-off between robustness and sensitivity.
Small relative variance values are preferred, as they represent
high cohesion of a given combination rule. Seven widely used
evidence combination rules were evaluated using the newly
proposed evaluation criteria. PCR5 is a moderate rule which
is good for practical use when combining two BBAs. For
combining more than two BBAs, we expect that PCR6 will
be a good choice, but more investigation is needed in the
future to evaluate its performance precisely.


In this work, we have added noise to BBAs mainly
by modifying the mass assignment of the primary focal
element and by creating new focal elements. In future
work, we will try other methods of adding noise to BBAs,
e.g., eliminating some of the original focal elements. In our
Monte-Carlo simulations, there are no pre-set mass assignments
for the BBAs. In this paper, we generate only one BBA in each
cycle, from which we generate a sequence of BBAs by
adding small noise. The BBAs to be combined are the original
BBA and the BBAs with small noise. In future work,
we will try to generate two BBA sequences and add noise
to each of them, so that special BBAs can be used in the
evaluation procedure, e.g., highly conflicting BBAs. We can
then carry out more specific performance evaluations of the
combination rules. In this paper, we focused only on the
properties of sensitivity and divergence. Evaluation criteria
for other aspects of evidence combination are also required for
evaluating and designing new combination rules; they will be
investigated in future research and forthcoming publications.


A New Self-Adaptive Fusion Algorithm Based on DST
and DSmT


Abstract—A new self-adaptive fusion algorithm based on DST
and DSmT is proposed. In the new algorithm, part of the
conflicting information is normalized according to DST, while the
other part is processed by DSmT. A controlling factor adaptively
controls the quantity of information handled by each of the two
methods, which avoids having to set a conflict threshold. The
simulation results indicate that the new self-adaptive fusion
algorithm based on DST and DSmT can deal with any conflicting
situation with good convergence performance.


Keywords—DST; DSmT; controlling factor; information fusion


I. INTRODUCTION


Due to the complexity of the modern battlefield, target
identification is becoming more and more complex. It is
difficult to obtain an accurate and credible identification
result from a single sensor. Therefore, target identification
based on multi-source information is becoming a hot topic.
Dempster-Shafer theory (DST) is an efficient method for
reasoning under uncertainty ([1]). It is widely used in the
field of integrated identification. However, DST cannot give
effective fusion results when the information from different
sources becomes highly conflicting. Many improvements have been
proposed, such as Yager’s rule of combination ([2]), Murphy’s
rule of combination ([3]), Dengyong’s rule of combination ([4]),
etc. Dezert presented the Dezert-Smarandache theory (DSmT)
([5]), which can be considered an extension of the classical
DST. DSmT performs well in fusing uncertain, highly conflicting
and imprecise sources. It can solve not only static problems
but also complex dynamic fusion problems.

DST and DSmT each have their own advantages and
disadvantages for fusing multi-source information. The
advantages of DST mainly appear when the degree of conflict is
low, whereas it may give a bad fusion result, absolutely
contrary to the facts, when the sources are in a high degree of
conflict. DSmT is more efficient in combining highly
conflicting sources, but it converges toward certainty slowly,
especially when the degree of conflict is low. A new
self-adaptive fusion algorithm is therefore put forward in this
paper.

Originally published as Yu X.H., Zhou Q.-J., Li Y.-L., An J., Liu Z.-C., A
new self-adaptive fusion algorithm based on DST and DSmT, in Proc. of
17th Int Conf on Information Fusion, Salamanca, Spain, July 7-10,
2014, and reprinted with permission.


II. REVIEW OF THE THEORY OF EVIDENCE


A. DST


DST was first proposed by Dempster in 1967 and
extended by Shafer. Its main idea is reviewed as follows.
Let $\Theta = \{\theta_1, \cdots, \theta_n\}$ be the frame of discernment
of the fusion problem, where all elements of $\Theta$ are exclusive.
A basic belief assignment (BBA) $m : 2^\Theta \to [0,1]$ is defined as

$$\begin{cases} \sum\limits_{A \in 2^\Theta} m(A) = 1 \\ m(\emptyset) = 0 \end{cases} \tag{1}$$

where $2^\Theta$ is the power set of $\Theta$, which includes all its
subsets. For two independent bodies of evidence whose BBAs are
denoted by $m_1$ and $m_2$ respectively, the BBA of the
combination of the two bodies is given by the following rule

$$m_{DS}(X) = \begin{cases} \dfrac{\sum\limits_{A \cap B = X} m_1(A)\, m_2(B)}{1 - k}, & \forall X \subseteq \Theta,\ X \neq \emptyset \\ 0, & X = \emptyset \end{cases} \tag{2}$$

where $k = \sum_{A \cap B = \emptyset} m_1(A)\, m_2(B)$ reflects the degree
of conflict between the two bodies of evidence.
B. DSmT 


DSmT, an extension of DST, was developed by Dezert and Smarandache ([5]). DSmT differs from DST in that the elements of $\Theta$ may overlap. For simplicity, let $D^\Theta$ (the hyper-power set) denote the set of all compositions built from elements of $\Theta$ with the $\cup$ and $\cap$ operators. The generalized basic belief assignment (GBBA) $m: D^\Theta \to [0,1]$ is defined as

$$\sum_{A \in D^\Theta} m(A) = 1, \qquad m(\emptyset) = 0 \tag{3}$$

Similarly, the classical combination rule for two independent bodies of evidence, whose GBBAs are $m_1$ and $m_2$ respectively, is given by

$$m(X) = \begin{cases} \sum_{A, B \in D^\Theta,\ A \cap B = X} m_1(A)\, m_2(B), & \forall X \in D^\Theta,\ X \neq \emptyset \\ 0, & X = \emptyset \end{cases} \tag{4}$$
It is remarked that DSmT keeps the conflicting information 
and doesn’t need normalization. 

The biggest difference between DST and DSmT can be intuitively described as follows. Let $\Theta = \{\theta_1, \theta_2\}$ be the frame of discernment (without any extra conditions). DST deals with a BBA $m(\cdot) \in [0,1]$ such that $m(\theta_1) + m(\theta_2) + m(\theta_1 \cup \theta_2) = 1$, while DSmT deals with a GBBA $m(\cdot) \in [0,1]$ such that $m(\theta_1) + m(\theta_2) + m(\theta_1 \cup \theta_2) + m(\theta_1 \cap \theta_2) = 1$.


C. Self-adaptive fusion algorithm based on DST and DSmT 


Although DST behaves well when the sources to be fused are in low degree of conflict, it may give a fusion result that is absolutely contrary to the facts when two sources are in high degree of conflict. Fortunately, DSmT is capable of fusing highly conflicting sources. A natural idea is that a self-adaptive fusion algorithm based on DST and DSmT could obtain better performance by a simple combination: if the degree of conflict is less than a given threshold, the Dempster-Shafer combination rule is used; otherwise the DSm combination rule is selected.

The limitation of applying the above idea in practice is that a conflict threshold must be set in advance. It is very difficult to determine a suitable value for this threshold since different systems have different degrees of conflict, and an empirical value, obtained by repeated experiments, is usually used instead of the true one. The risk is that once the threshold is not suitable, the fusion results will be bad.

To solve the above problem, this paper proposes a new self-adaptive algorithm based on DST and DSmT, which avoids setting the conflict threshold in advance.


III. A NEW SELF-ADAPTIVE FUSION ALGORITHM


Note that the key difficulty lies in how to process the 
conflicting information. Before introducing the new self- 
adaptive fusion algorithm, we will review the existing 
methods and explain why we adopt such an algorithm. 

According to DST, conflicting pieces of information between two bodies of evidence are eliminated by normalization. Yager ([2]) assigned the conflicting information to the union of all elements, viewing the conflict as unreliable and ignoring it. Smets ([8]) assigned the conflicting information to the empty set. He pointed out that all the sources of evidence are reliable but the frame of discernment is not complete, so the actual result may lie outside the frame. The limitation of the DST method is that it cannot deal with cases of high degree of conflict.

On the contrary, DSmT keeps the conflicting information 
useful and redistributes the conflicting information according 
to some principles while fusing. It is capable of coping with 
the cases of high degree conflicts, but it offers slow 
convergence for the result. 

To absorb the advantages of DST and DSmT, the new algorithm treats the conflicting information in a combined way: part of the conflicting information is distributed over the nonempty sets by normalization and the rest is redistributed according to other principles. A controlling factor is proposed to decide how much of the conflicting mass is normalized.


A. Evaluation of conflict between bodies of evidence 


Definition 1 (Conflict)[6]. A conflict between two beliefs in 
DS theory can be interpreted qualitatively as one source 
strongly supports one hypothesis and the other strongly 
supports another hypothesis, and the two hypotheses are not 
compatible. 

It is well known that the key problem in designing a self-adaptive fusion algorithm based on DST and DSmT is how to compute the conflict between two bodies of evidence. Next we will show that the existing measures, including the conflict coefficient used in DST and DSmT and the degree of similarity, are not suitable as measures of conflicting information, although the former has long been taken as such in the Dempster-Shafer theory community. We also propose a new measure to fill this gap.


Example 1. Let $\Theta$ be a frame of discernment with $n$ hypotheses $\{\theta_1, \ldots, \theta_n\}$. Assume $m_1$ and $m_2$ are two BBAs offered by two distinct sources, defined as

$$m_1(\theta_i) = 1/n, \quad m_2(\theta_i) = 1/n, \quad i = 1, 2, \ldots, n$$

Obviously, the two BBAs are totally consistent with each other, so there should be no conflict. First, computing the conflict coefficient gives $k = 1 - 1/n$, where $n$ is the number of hypotheses in the frame of discernment. Second, as $n$ increases, $k$ increases and approaches 1, as shown in Fig.1. If $k$ is taken as a measure of the degree of conflict, the two bodies of evidence are in high degree of conflict when $n$ is larger than 5. This is clearly contrary to the facts.
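The growth of $k$ with $n$ in Example 1 can be checked numerically. The short Python sketch below (the helper name `conflict_coefficient_uniform` is hypothetical) sums the products of masses over all pairs of distinct, hence disjoint, singletons.

```python
def conflict_coefficient_uniform(n):
    """Conflict coefficient k of two identical uniform BBAs on n exclusive hypotheses."""
    m = [1.0 / n] * n
    # Two different singletons have an empty intersection, so every
    # pair (i, j) with i != j contributes its product to the conflict.
    return sum(m[i] * m[j] for i in range(n) for j in range(n) if i != j)


ks = {n: conflict_coefficient_uniform(n) for n in (2, 5, 10, 100)}
for n, k in ks.items():
    print(n, round(k, 4))
```

Each value agrees with $k = 1 - 1/n$ even though the two sources are identical.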

















Fig.1. Relationship between $k$ and $n$ in Example 1

Example 2. Let $\Theta$ be a frame of discernment with two hypotheses $\{\theta_1, \theta_2\}$. The BBAs offered by two sources are defined as

$$m_1(\theta_1) = p, \quad m_1(\theta_2) = 1 - p$$
$$m_2(\theta_1) = 1 - p, \quad m_2(\theta_2) = p$$

where $p \in [0,1]$. It is obvious that the two bodies of evidence are highly contradictory. According to DST, the conflict coefficient can be calculated as $k = p^2 + (1-p)^2$. Fig.2 shows the relationship between $k$ and $p$. As $p$ changes from 0 to 1, $k$ decreases from 1 to 0.5 and then increases to 1 again. Note that $k$ is lower than 1 most of the time, especially when $p$ is around 0.5, so it cannot correctly reflect the conflict between the two sources of evidence.
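For reference, the value of $k$ in Example 2 follows directly from the definition of the conflict coefficient, since the only pairs of focal elements with empty intersection are $(\theta_1, \theta_2)$ and $(\theta_2, \theta_1)$:

$$k = m_1(\theta_1)\, m_2(\theta_2) + m_1(\theta_2)\, m_2(\theta_1) = p \cdot p + (1-p)(1-p) = p^2 + (1-p)^2 = 1 - 2p(1-p)$$

The product $p(1-p)$ is largest at $p = 1/2$, which gives the minimum $k = 1/2$, while $k = 1$ at $p = 0$ and $p = 1$.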

















Fig.2. Relationship between $k$ and $p$ in Example 2

From Examples 1 and 2, we can see that the conflict coefficient $k$ cannot be used as a suitable measure of conflict.

To overcome the above shortcoming, some other methods of evaluating conflict have been put forward, such as the degree of similarity. Since the similarity of two bodies of evidence can also reflect their conflict, it can be used as a candidate to reveal the conflict between two sources of evidence. Generally speaking, the larger the degree of similarity is, the smaller the conflict is.

Usually, the degree of similarity can be calculated from the distance between bodies of evidence. Several kinds of distance are used in information fusion, such as the Euclidean distance ([7]) proposed by Cuzzolin and the Bhattacharyya distance ([8]) given by Ristic and Smets. Both of them are defined in the frame of Dempster-Shafer theory. Nevertheless, neither of them can reflect the similarity of the subsets of the frame $\Theta$. Besides, Tessem turned the belief function into a probability function by the pignistic transformation and evaluated the distance between bodies of evidence at the pignistic level ([9]). However, the distance defined in this way does not accord with the properties required of a distance. No valuable distance could be used in practice until the introduction of the Jousselme distance ([10]), which is the most widely used distance of evidence at present. The definition of the Jousselme distance is given as follows.


Definition 2 (Jousselme distance). Let $\Theta$ be a frame of discernment with $n$ hypotheses. The BBAs offered by two sensors are denoted as $m_1$ and $m_2$. The distance between them can be defined as

$$Dis(m_1, m_2) = \sqrt{\frac{1}{2}\,(m_1 - m_2)^T D\,(m_1 - m_2)} \tag{5}$$

where $D = (D_{ij})$, with $D_{ij} = \dfrac{|A_i \cap B_j|}{|A_i \cup B_j|}$, is a $2^n \times 2^n$-dimensional matrix, and $|A|$ denotes the number of elements of $A$. It reflects the degree of similarity of the evidence. Formula (5) can be rewritten as

$$Dis(m_1, m_2) = \sqrt{\frac{1}{2}\left(\|m_1\|^2 + \|m_2\|^2 - 2\langle m_1, m_2 \rangle\right)}$$

where $\|m_i\|^2 = \langle m_i, m_i \rangle$, $i = 1, 2$, and

$$\langle m_1, m_2 \rangle = \sum_{i=1}^{2^n} \sum_{j=1}^{2^n} m_1(A_i)\, m_2(B_j)\, \frac{|A_i \cap B_j|}{|A_i \cup B_j|}$$

is the inner product of the two vectors.

Note that in the frame of discernment in DSmT, 
hypotheses could be overlapped potentially. Then a 
generalized Jousselme distance is defined as follows. 


Definition 3 (generalized Jousselme distance). Let $\Theta$ be a frame of discernment in DSmT with $n$ hypotheses. The two GBBAs offered by sensors are denoted as $m_1$ and $m_2$. The distance between them is defined as

$$Dis(m_1, m_2) = \sqrt{\frac{1}{2}\,(m_1 - m_2)^T D\,(m_1 - m_2)} \tag{6}$$

where $D = (D_{ij})$, with $D_{ij} = \dfrac{|A_i \cap B_j|}{|A_i \cup B_j|}$, is an $N \times N$-dimensional matrix, $N$ is the number of elements in the power set of $\Theta$, and $|A|$ denotes the DSm cardinality of $A$. Similarly, the distance can be transformed as

$$Dis(m_1, m_2) = \sqrt{\frac{1}{2}\left(\|m_1\|^2 + \|m_2\|^2 - 2\langle m_1, m_2 \rangle\right)}$$

where $\|m_i\|^2 = \langle m_i, m_i \rangle$, $i = 1, 2$, and

$$\langle m_1, m_2 \rangle = \sum_{i=1}^{N} \sum_{j=1}^{N} m_1(A_i)\, m_2(B_j)\, \frac{|A_i \cap B_j|}{|A_i \cup B_j|}, \quad \forall A_i, B_j \in D^\Theta.$$

It is easy to see that $Dis(m_1, m_2) \in [0,1]$, and the degree of similarity can be defined as

$$Sim(m_1, m_2) = 1 - Dis(m_1, m_2) \tag{7}$$

Obviously, $Sim(m_1, m_2) \in [0,1]$.

In Example 1, the Jousselme distance is always 0 whatever the value of $n$. In other words, the similarity between the two bodies of evidence is always 1, which accords with the facts. In Example 2, the Jousselme distance is computed as $Dis(m_1, m_2) = |1 - 2p|$. As $p$ changes from 0 to 1, the Jousselme distance decreases from 1 to 0 and then increases to 1 again, which is also intuitively reasonable. However, one cannot determine the value of conflict between two bodies of evidence correctly from the similarity alone, as the following Example 3 shows.

Example 3. Let $\Theta$ be a frame of discernment with two hypotheses $\{\theta_1, \theta_2\}$. The two BBAs offered by sensors are defined as

$$m_1(\theta_1) = 0.8, \quad m_1(\theta_1 \cup \theta_2) = 0.2$$
$$m_2(\theta_1 \cup \theta_2) = 1$$

Simple calculation yields $Dis(m_1, m_2) = 0.5657$ and $Sim(m_1, m_2) = 0.4343$. If the conflict is estimated by the degree of similarity, the two sources of evidence are in conflict. However, the second body of evidence clearly expresses total ignorance, so one cannot assert that they are in conflict. In other words, it is not credible to determine the degree of conflict only by the degree of similarity.
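The distances quoted for Examples 1 to 3 can be reproduced with a few lines of Python. This is a minimal sketch of formula (5) under Shafer's model (exclusive hypotheses), assuming BBAs are stored as dictionaries from `frozenset` focal elements to masses and that the empty set is never a focal element; the function name `jousselme_distance` and the labels `theta1`, `theta2` are hypothetical.

```python
from math import sqrt


def jousselme_distance(m1, m2):
    """Jousselme distance of formula (5) for BBAs given as {frozenset: mass}."""
    focal = list(set(m1) | set(m2))
    diff = [m1.get(a, 0.0) - m2.get(a, 0.0) for a in focal]
    quad = 0.0
    for i, a in enumerate(focal):
        for j, b in enumerate(focal):
            # D_ij = |A_i ∩ B_j| / |A_i ∪ B_j|
            quad += diff[i] * diff[j] * len(a & b) / len(a | b)
    return sqrt(max(quad, 0.0) / 2.0)


t1, t2 = "theta1", "theta2"

# Example 1 with n = 4: two identical uniform BBAs.
uniform = {frozenset([f"theta{i}"]): 0.25 for i in range(1, 5)}
d1 = jousselme_distance(uniform, uniform)

# Example 2 with p = 0.2: the distance should equal |1 - 2p| = 0.6.
p = 0.2
d2 = jousselme_distance(
    {frozenset([t1]): p, frozenset([t2]): 1 - p},
    {frozenset([t1]): 1 - p, frozenset([t2]): p},
)

# Example 3: a committed source against a vacuous one.
d3 = jousselme_distance(
    {frozenset([t1]): 0.8, frozenset([t1, t2]): 0.2},
    {frozenset([t1, t2]): 1.0},
)
print(round(d3, 4))  # 0.5657
```

Only the focal elements of the two BBAs are enumerated, since all other entries of $m_1 - m_2$ are zero and do not contribute to the quadratic form.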

In summary, neither the conflict coefficient nor the degree of similarity can be used alone as the quantitative measure of conflict. A natural idea is to make an objective judgment by considering the two factors together. Actually, many researchers have followed this way. For example, Jiang ([12]) took the average of the Jousselme distance and the conflict coefficient as the new measure of conflict. Liu ([6]) formed a two-dimensional measure from the conflict coefficient and the distance between betting commitments to analyze the conflict under different situations. Liu ([13]) used the geometric mean of the conflict coefficient and the distance between betting commitments as the measure of conflict.

This paper also adopts the above idea. One will see in the 
following text that, in our new combination model there are 
two places where the degrees of conflict need to be estimated. 
On one side, the classical conflict coefficient is taken to 
measure the value of conflict. On the other side, the similarity 
between bodies of evidence is used as a controlling factor to 
distribute the conflicting information. See next for details. 


B. New combination rule 

Let $\Theta = \{\theta_1, \ldots, \theta_n\}$ be a discernment frame with $n$ hypotheses. The hypotheses of the frame could be non-exclusive. $D^\Theta$ is the hyper-power set. The generalized basic belief assignment is defined as $m: D^\Theta \to [0,1]$ where

$$\sum_{A \in D^\Theta} m(A) = 1, \qquad m(\emptyset) = 0 \tag{8}$$


Let $m_1$ and $m_2$ be the GBBAs of two mutually independent sensors. The new combination rule can be defined as

$$m_{12}(X) = \frac{\sum_{A, B \in D^\Theta,\ A \cap B = X} m_1(A)\, m_2(B) + P(X)}{1 - k'}, \quad \forall X \in D^\Theta,\ X \neq \emptyset \tag{9}$$

where $k = \sum_{A, B \in D^\Theta,\ A \cap B = \emptyset} m_1(A)\, m_2(B)$ reflects the mass of conflict, and $k' = \sigma k$ is the conflict that will be distributed over all hypotheses by normalization. The rest, $(1 - \sigma)k$, will be redistributed by other rules. Here $\sigma$ is a controlling factor. Denote the conflicting information that is distributed to hypothesis $X$ by $P(X)$, so that $\sum_{X \in D^\Theta} P(X) = (1 - \sigma)k$.
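As a consistency check on formula (9), writing $k' = \sigma k$ for the normalized part of the conflict, one can verify that the fused masses sum to one. The non-conflicting products carry a total mass of $1 - k$ and the redistributed terms $P(X)$ carry $(1 - \sigma)k$, so

$$\sum_{X \in D^\Theta,\ X \neq \emptyset} m_{12}(X) = \frac{(1 - k) + (1 - \sigma)k}{1 - \sigma k} = \frac{1 - \sigma k}{1 - \sigma k} = 1.$$

This is why the denominator is $1 - \sigma k$ rather than the $1 - k$ of Dempster's rule.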

It can be seen from formula (9) that the fusion results change with $\sigma$. When $\sigma = 0$, all conflicting information is kept and redistributed to the hypotheses that cause the conflict, and the new algorithm degenerates to the DSm combination rule. When $\sigma = 1$, all conflicting information is distributed over all hypotheses by normalization, and the new algorithm degenerates to the DS combination rule. When $\sigma \in (0,1)$, part of the conflict is kept as useful and the rest is normalized, so the new algorithm is a synthesis of the DSm and DS combination rules.

The new algorithm uses the degree of similarity as the controlling factor to adjust the fusion result adaptively. Let $s$ be the number of sources. If $s = 2$, then $\sigma = Sim(m_1, m_2)$; if $s > 2$, then $\sigma = \min\{Sim(m_i, m_j) \mid i, j = 1, \ldots, s\}$.
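The controlling factor for several sources can be sketched as follows. This is a minimal illustration that assumes Bayesian BBAs on exclusive singletons, in which case the matrix $D$ of formula (5) reduces to the identity; the function names are hypothetical, and the three BBAs are the first three bodies of evidence of numerical Example 2 in Section IV.

```python
from itertools import combinations
from math import sqrt


def similarity_singletons(m1, m2):
    """Sim = 1 - Dis for BBAs whose focal elements are exclusive singletons.

    In this special case D is the identity matrix, so Dis is a scaled
    Euclidean distance between the two mass vectors.
    """
    keys = set(m1) | set(m2)
    dis = sqrt(0.5 * sum((m1.get(h, 0.0) - m2.get(h, 0.0)) ** 2 for h in keys))
    return 1.0 - dis


def controlling_factor(bbas):
    """sigma = smallest pairwise similarity among the sources."""
    # Pairs with i == j have similarity 1 and never change the minimum.
    return min(similarity_singletons(a, b) for a, b in combinations(bbas, 2))


sources = [
    {"A": 0.5, "B": 0.2, "C": 0.3},
    {"A": 0.0, "B": 0.9, "C": 0.1},
    {"A": 0.55, "B": 0.1, "C": 0.35},
]
sigma = controlling_factor(sources)
```

Taking the minimum means a single strongly dissenting source is enough to push $\sigma$ down, and therefore to shift the rule toward keeping and redistributing the conflict.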


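When more than two sources are to be fused, the new combination rule can be defined as

$$m(X) = \frac{\sum_{\bigcap_{j=1}^{s} A_j = X}\ \prod_{j=1}^{s} m_j(A_j) + P(X)}{1 - k'}, \quad \forall X \in D^\Theta,\ X \neq \emptyset \tag{10}$$

where $k' = \sigma k$ and $k = \sum_{\bigcap_{j=1}^{s} A_j = \emptyset}\ \prod_{1 \leq j \leq s} m_j(A_j)$.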


C. Conflict distribution rule 


To cope with the complex constraints in real systems, Dezert proposed the hybrid DSm combination rule, which works properly even under a high degree of conflict. However, due to the large number of elements in $D^\Theta$, it cannot offer quick convergence and costs too much computation time. To achieve better performance, some new redistribution rules based on the DSm rule were put forward, classified as PCR1~PCR6 according to their redistribution principles ([11]). PCR5 is considered the most precise in redistribution, and its combination rule is defined as

$$m(X) = \sum_{A \cap B = X} m_1(A)\, m_2(B) + \sum_{\substack{Y \in D^\Theta \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2\, m_1(Y)}{m_2(X) + m_1(Y)} \right] \tag{11}$$

where $X \in D^\Theta$, $X \neq \emptyset$.

According to PCR5, the conflicting information caused by $X$ and $Y$ is redistributed to $X$ and $Y$ themselves, without considering their union. In fact, the conflicting information is decomposed into two parts, $m_1(X)\, m_2(Y)$ and $m_2(X)\, m_1(Y)$, which are redistributed separately. Hypotheses $X$ and $Y$ receive the conflicting information in proportion to their basic belief assignments. Due to its high performance, PCR5 is widely used in real systems. As an upgraded version of PCR5, PCR6 is the latest rule; it is used to fuse more bodies of evidence and has also been applied in real systems.
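The proportional redistribution described above can be sketched in Python for two sources. The sketch assumes exclusive hypotheses, as in the numerical examples of Section IV, so that focal elements can be represented as `frozenset` objects and intersected directly; the function name `pcr5_combine` is hypothetical. The input masses are the first two bodies of evidence of numerical Example 1.

```python
def pcr5_combine(m1, m2):
    """Two-source PCR5 for BBAs given as {frozenset: mass}."""
    fused = {}
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                # Conjunctive part: the product supports A ∩ B.
                fused[inter] = fused.get(inter, 0.0) + ma * mb
            elif ma + mb > 0.0:
                # Partial conflict ma*mb is split between a and b
                # in proportion to ma and mb.
                fused[a] = fused.get(a, 0.0) + ma * ma * mb / (ma + mb)
                fused[b] = fused.get(b, 0.0) + mb * mb * ma / (ma + mb)
    return fused


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m1 = {A: 0.5, B: 0.2, C: 0.3}
m2 = {A: 0.6, B: 0.2, C: 0.2}
fused = pcr5_combine(m1, m2)
print({"".join(x): round(v, 4) for x, v in fused.items()})
```

The two shares of each partial conflict add up to `ma * mb`, so no mass is lost and no normalization step is needed.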

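Our algorithm also adopts PCR5 for conflict redistribution, and then the term $P(X)$ in formula (9) could be rewritten as:

$$P(X) = (1 - \sigma) \sum_{\substack{Y \in D^\Theta \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2\, m_1(Y)}{m_2(X) + m_1(Y)} \right] \tag{12}$$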


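If there are more than two sources, they can be combined one by one according to PCR5. Alternatively, they can be combined according to PCR6, and the term $P(X)$ in formula (9) could be rewritten as

$$P(X) = (1 - \sigma) \sum_{i=1}^{s} m_i(X)^2 \sum_{\substack{\bigcap_{k=1}^{s-1} Y_{\sigma_i(k)} \cap X = \emptyset \\ (Y_{\sigma_i(1)}, \ldots, Y_{\sigma_i(s-1)}) \in (D^\Theta)^{s-1}}} \left( \frac{\prod_{j=1}^{s-1} m_{\sigma_i(j)}\big(Y_{\sigma_i(j)}\big)}{m_i(X) + \sum_{j=1}^{s-1} m_{\sigma_i(j)}\big(Y_{\sigma_i(j)}\big)} \right) \tag{13}$$

where $\sigma_i(j) = j$ if $j < i$, and $\sigma_i(j) = j + 1$ if $j \geq i$.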


D. Realization of the new algorithm 

Let $\Theta = \{\theta_1, \ldots, \theta_n\}$ be a frame of discernment with $n$ hypotheses. $m_1(A_i)$ and $m_2(B_j)$ are the basic belief functions of two mutually independent sources, where $\sum_{A_i \in D^\Theta} m_1(A_i) = 1$ and $\sum_{B_j \in D^\Theta} m_2(B_j) = 1$. The new self-adaptive fusion algorithm based on DST and DSmT can be described in the following four steps. Here we use $m(X)$ to denote the fusion result.

Step 1. $\forall X \in D^\Theta$, $X \neq \emptyset$, set $m(X) = 0$ and $k = 0$. Compute $Dis(m_1, m_2)$, $Sim(m_1, m_2)$, and $\sigma = Sim(m_1, m_2)$.

Step 2. $\forall i, j = 1, 2, \ldots$, compute $m_1(A_i)\, m_2(B_j)$. If $A_i \cap B_j \neq \emptyset$, then renew $m(A_i \cap B_j)$ with $m(A_i \cap B_j) + m_1(A_i)\, m_2(B_j)$; else renew $k$ with $k + m_1(A_i)\, m_2(B_j)$, renew $m(A_i)$ with $m(A_i) + (1 - \sigma)\, m_1^2(A_i)\, m_2(B_j) / (m_1(A_i) + m_2(B_j))$, and renew $m(B_j)$ with $m(B_j) + (1 - \sigma)\, m_1(A_i)\, m_2^2(B_j) / (m_1(A_i) + m_2(B_j))$;

Step 3. If all of $m_1(A_i)\, m_2(B_j)$ have been computed, go to Step 4; otherwise, go to Step 2;

Step 4. $\forall X \in D^\Theta$, if $X = \emptyset$, $m(X) = 0$; otherwise, $m(X) = m(X) / (1 - \sigma k)$.
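The four steps can be sketched in Python for two sources as follows. This is a minimal illustration rather than the authors' code: it assumes exclusive hypotheses (as in Section IV), BBAs stored as `{frozenset: mass}` dictionaries, and the function names `jousselme_similarity` and `self_adaptive_combine` are hypothetical. The inputs are the first two bodies of evidence of numerical Example 1.

```python
from math import sqrt


def jousselme_similarity(m1, m2):
    """Sim = 1 - Dis, with Dis the Jousselme distance of formula (5)."""
    focal = list(set(m1) | set(m2))
    diff = [m1.get(a, 0.0) - m2.get(a, 0.0) for a in focal]
    quad = sum(
        diff[i] * diff[j] * len(a & b) / len(a | b)
        for i, a in enumerate(focal)
        for j, b in enumerate(focal)
    )
    return 1.0 - sqrt(max(quad, 0.0) / 2.0)


def self_adaptive_combine(m1, m2):
    """Two-source self-adaptive rule following Steps 1 to 4."""
    sigma = jousselme_similarity(m1, m2)          # Step 1
    fused, k = {}, 0.0
    for a, ma in m1.items():                      # Steps 2 and 3
        for b, mb in m2.items():
            inter = a & b
            if inter:
                fused[inter] = fused.get(inter, 0.0) + ma * mb
            else:
                k += ma * mb
                if ma + mb > 0.0:
                    # A share (1 - sigma) of the partial conflict is
                    # redistributed to a and b as in PCR5.
                    share = (1.0 - sigma) / (ma + mb)
                    fused[a] = fused.get(a, 0.0) + share * ma * ma * mb
                    fused[b] = fused.get(b, 0.0) + share * mb * mb * ma
    norm = 1.0 - sigma * k                        # Step 4
    return {x: v / norm for x, v in fused.items()}, sigma, k


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m1 = {A: 0.5, B: 0.2, C: 0.3}
m2 = {A: 0.6, B: 0.2, C: 0.2}
fused, sigma, k = self_adaptive_combine(m1, m2)
print({"".join(x): round(v, 4) for x, v in fused.items()})
```

For these inputs $\sigma = 0.9$ and $k = 0.6$, so 90% of the conflict is absorbed by the normalization in Step 4 and only 10% is redistributed proportionally.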


IV. NUMERICAL EXAMPLE
Suppose five sensors, independent of each other, are used to detect targets. Let $\Theta = \{A, B, C\}$ be the frame of discernment. Elements in $\Theta$ are exclusive.


Example 1 (no conflicting evidence): The basic belief 
assignments offered by five sensors are given as follows. 


Evidence 1: $m_1(A) = 0.5$, $m_1(B) = 0.2$, $m_1(C) = 0.3$;
Evidence 2: $m_2(A) = 0.6$, $m_2(B) = 0.2$, $m_2(C) = 0.2$;
Evidence 3: $m_3(A) = 0.55$, $m_3(B) = 0.1$, $m_3(C) = 0.35$;
Evidence 4: $m_4(A) = 0.55$, $m_4(B) = 0.1$, $m_4(C) = 0.35$;
Evidence 5: $m_5(A) = 0.55$, $m_5(B) = 0.1$, $m_5(C) = 0.35$;


It is easy to see that all bodies of evidence support the 
identity A and they are in low degree of conflict. Fusion 
results offered by different combination rules are given in 
table 1. 

Example 2 (one conflicting body of evidence): The basic 
belief assignments offered by five sensors are given as 
follows. 


Evidence 1: $m_1(A) = 0.5$, $m_1(B) = 0.2$, $m_1(C) = 0.3$;
Evidence 2: $m_2(A) = 0$, $m_2(B) = 0.9$, $m_2(C) = 0.1$;
Evidence 3: $m_3(A) = 0.55$, $m_3(B) = 0.1$, $m_3(C) = 0.35$;
Evidence 4: $m_4(A) = 0.55$, $m_4(B) = 0.1$, $m_4(C) = 0.35$;
Evidence 5: $m_5(A) = 0.55$, $m_5(B) = 0.1$, $m_5(C) = 0.35$;


It is obvious that most bodies of evidence support the identity A but the second body of evidence supports the identity B. In other words, they are in high degree of conflict. Fusion results offered by different combination rules are given in Table 2.

Example 3 (two conflicting bodies of evidence): The basic 
belief assignments offered by five sensors are given as 
follows. 


Evidence 1: $m_1(A) = 0.5$, $m_1(B) = 0.2$, $m_1(C) = 0.3$;
Evidence 2: $m_2(A) = 0$, $m_2(B) = 0.9$, $m_2(C) = 0.1$;
Evidence 3: $m_3(A) = 0.3$, $m_3(B) = 0.6$, $m_3(C) = 0.1$;
Evidence 4: $m_4(A) = 0.55$, $m_4(B) = 0.1$, $m_4(C) = 0.35$;
Evidence 5: $m_5(A) = 0.55$, $m_5(B) = 0.1$, $m_5(C) = 0.35$;


As in Example 2, most bodies of evidence support the identity A, but two bodies of evidence support the identity B. They are also in high degree of conflict. Fusion results offered by different combination rules are given in Table 3.


TABLE 1. FUSION RESULTS OF EXAMPLE 1

Fusion rule                          Element of D^Θ   m1m2     m1m2m3   m1m2m3m4   m1m2m3m4m5
DST                                  A                …        …        …          0.9503
                                     B                …        0.0211   0.0041     0.0007
                                     C                …        …        …          …
PCR5                                 A                0.6529   0.7087   0.7373     0.7506
                                     B                0.1426   0.0602   0.0281     0.0201
                                     C                0.2045   0.2311   0.2346     0.2293
New self-adaptive combination rule   A                0.7289   0.8235   0.8598     0.8755
                                     B                0.1092   0.0303   0.0113     0.008
                                     C                0.1619   0.1462   0.1289     0.1165


TABLE 2. FUSION RESULTS OF EXAMPLE 2

Fusion rule                          Element of D^Θ   m1m2     m1m2m3   m1m2m3m4   m1m2m3m4m5
DST                                  A                0        0        0          0
                                     B                0.8571   0.6316   0.3288     …
                                     C                0.1429   0.3684   0.6712     …
PCR5                                 A                0.2024   0.37     0.5101     …
                                     B                0.6851   0.4482   0.258      …
                                     C                0.1125   0.1818   0.2319     …
New self-adaptive combination rule   A                0.1797   0.3727   0.5724     …
                                     B                0.7044   0.4441   0.2068     …
                                     C                …        …        …          …


Table 1 shows that DST is very suitable for fusing bodies of evidence in low degree of conflict. However, because PCR5 keeps the conflicting focal elements, the support degree of A is just 0.7506 when fusing the fifth body of evidence. Fig.3 shows that the degree of A offered by PCR5 is much less than the value offered by DST (0.9503). When applying the new self-adaptive algorithm, the support degree of A is 0.8755, a great improvement over PCR5. In other words, the new algorithm fuses sources of evidence in low degree of conflict well.

It can be seen from Table 2 that, due to the high degree of conflict, the mass of A in the fusion results of the DST rule is always 0. Obviously, this is illogical in the real world. By using the PCR5 rule, one can make the right decision.

However, we can see from Fig.4 that the PCR5 rule offers only slow convergence, while the new self-adaptive fusion algorithm not only overcomes the shortcoming of DST, whose fusion result is illogical, but also makes the right decision with quick convergence.

Table 3 gives the fusion results of Example 3. Although two bodies of evidence are in conflict with the other evidence, our algorithm and PCR5 give the right results. Besides, it can be seen from Fig.5 that, as the number of bodies of evidence increases, the new algorithm converges better than PCR5. In conclusion, the new self-adaptive fusion algorithm is the best of the three algorithms when dealing with a high degree of conflict.

Fig.5. Support degree of A in Example 3





To sum up, the new self-adaptive algorithm can deal with 
high degree of conflict with a good performance on 
convergence. 


V. CONCLUSION 


Based on DST and DSmT, this paper proposes a new self-adaptive fusion algorithm. A controlling factor is introduced to avoid setting a conflict threshold. Simulation results show that the new model reaches a preferable fusion result whether or not the sources of evidence are in high degree of conflict. Furthermore, the new algorithm offers quick convergence and is more appropriate for use in real fusion systems.


Abstract. In this paper, we present a method based on belief functions to evaluate the quality of the optimal assignment solution of a classical association problem encountered in multiple target tracking applications. The purpose of this work is not to provide a new algorithm for solving the assignment problem, but a solution to estimate the quality of the individual associations (pairings) given in the optimal assignment solution. To the knowledge of the authors, this problem has not been addressed so far in the literature and its solution may have practical aspects for improving the performances of multisensor-multitarget tracking systems.


Keywords: Data association; PCR6 rule; Belief function. 


1 Introduction 


Efficient algorithms for modern multisensor-multitarget tracking (MS-MTT) systems [1, 2] require estimating and predicting the states (position, velocity, etc) of the targets evolving in the surveillance area covered by the sensors. The estimations and the predictions are based on sensor measurements and dynamical model assumptions. In the monosensor context, MTT requires solving the data association (DA) problem to associate the available measurements at a given time with the predicted states of the targets to update their tracks using filtering techniques (Kalman filter, Particle filter, etc). In the multisensor MTT context, we need to solve more difficult multi-dimensional assignment problems under constraints. Fortunately, efficient algorithms have been developed in operational research and tracking communities for formalizing and solving these optimal assignment problems. Several approaches based on different models can be used to establish the rewards matrix, either based on the probabilistic framework [1,3], or on the belief function (BF) framework [4-7]. In this paper, we do not focus on the construction of the rewards matrix¹, and our purpose is to provide a method to evaluate the quality (interpreted as a confidence score) of each association (pairing) provided in the optimal solution based on its consistency (stability) with respect to all the second best solutions.


1 We assume that the rewards matrix is known and has been obtained by a method chosen by the user, either in the probabilistic or in the BF framework.

The simple DA problem under concern can be formulated as follows. We have $m > 1$ targets $T_i$ ($i = 1, \ldots, m$), and $n > 1$ measurements² $z_j$ ($j = 1, \ldots, n$) at a given time $k$, and an $m \times n$ rewards (gain/payoff) matrix $\Omega = [\omega(i,j)]$ whose elements $\omega(i,j) \geq 0$ represent the payoff (usually homogeneous to the likelihood) of the association of target $T_i$ with measurement $z_j$, denoted $(T_i, z_j)$. The data association problem consists in finding the global optimal assignment of the targets to some measurements by maximizing³ the overall gain in such a way that no more than one target is assigned to a measurement, and reciprocally.

Without loss of generality, we can assume $\omega(i,j) \geq 0$ because if some elements $\omega(i,j)$ of $\Omega$ were negative, we can always add the absolute value of the most negative element to all elements of $\Omega$ to work with a new payoff matrix $\Omega' = [\omega'(i,j)]$ having all elements $\omega'(i,j) \geq 0$, and we get the same optimal assignment solution with $\Omega$ and with $\Omega'$. Moreover, we can also assume, without loss of generality, $m \leq n$, because otherwise we can always swap the roles of targets and measurements in the mathematical problem definition by working directly with $\Omega^t$ instead, where the superscript $t$ denotes the transposition of the matrix. The optimal assignment problem consists of finding the $m \times n$ binary association matrix $A = [a(i,j)]$ which maximizes the global rewards $R(\Omega, A)$ given by


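$$R(\Omega, A) \triangleq \sum_{i=1}^{m} \sum_{j=1}^{n} \omega(i,j)\, a(i,j) \tag{1}$$

$$\text{Subject to } \begin{cases} \sum_{j=1}^{n} a(i,j) = 1 & (i = 1, \ldots, m) \\ \sum_{i=1}^{m} a(i,j) \leq 1 & (j = 1, \ldots, n) \\ a(i,j) \in \{0, 1\} & (i = 1, \ldots, m \text{ and } j = 1, \ldots, n) \end{cases} \tag{2}$$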


The association indicator value $a(i,j) = 1$ means that the corresponding target $T_i$ and measurement $z_j$ are associated, and $a(i,j) = 0$ means that they are not associated ($i = 1, \ldots, m$ and $j = 1, \ldots, n$).
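To make the problem statement concrete, here is a minimal Python sketch that evaluates the global reward (1) and checks the constraints (2), read here as: each target is assigned exactly one measurement and each measurement is used at most once (with $m \leq n$). The function names and the small $2 \times 3$ matrices are hypothetical and serve only as an illustration.

```python
def is_feasible(A):
    """Check constraints (2) for an m x n binary association matrix."""
    entries_binary = all(a in (0, 1) for row in A for a in row)
    each_target_once = all(sum(row) == 1 for row in A)
    each_measurement_at_most_once = all(sum(col) <= 1 for col in zip(*A))
    return entries_binary and each_target_once and each_measurement_at_most_once


def global_reward(omega, A):
    """Global reward R(Omega, A) of formula (1)."""
    return sum(
        w * a
        for w_row, a_row in zip(omega, A)
        for w, a in zip(w_row, a_row)
    )


# Hypothetical 2 x 3 example: 2 targets, 3 measurements.
omega = [[4, 1, 3],
         [2, 0, 5]]
A_good = [[1, 0, 0],
          [0, 0, 1]]   # target 1 -> z1, target 2 -> z3
A_bad = [[1, 0, 0],
         [1, 0, 0]]    # both targets claim z1: violates (2)

reward = global_reward(omega, A_good)
```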

The solution of the optimal assignment problem stated in (1)-(2) is well reported in the literature and several efficient methods have been developed in the operational research and tracking communities to solve it. The most well-known algorithms are the Kuhn-Munkres (a.k.a. Hungarian) algorithm [8,9] and its extension to rectangular matrices proposed by Bourgeois and Lassalle in [10], the Jonker-Volgenant method [11], and Auction [12]. More sophisticated methods using Murty's method [13], and some variants [3, 14-19], are also able to provide not only the best assignment, but also the m-best assignments. We will not present in detail all these classical methods because they have been already well reported in the literature [20, 21], and they are quite easily accessible on the web.

2 In a multi-sensor context targets can be replaced by tracks provided by a given tracker associated with a type of sensor, and measurements can be replaced by another set of tracks. In different contexts, possible equivalents are assigning personnel to jobs or assigning delivery trucks to locations.

3 In some problems, $\Omega = [\omega(i,j)]$ represents a cost matrix whose elements are the negative log-likelihood of association hypotheses. In this case, the data association problem consists in finding the best assignment that minimizes the overall cost.

In this paper, we want to provide a confidence level (i.e. a quality indicator) in the optimal data association solution. More precisely, we are searching for an answer to the question: how to measure the quality of the pairings $a(i,j) = 1$ provided in the optimal assignment solution $A$? The necessity to establish a quality indicator is motivated by the following three main reasons:


1. In some practical tracking environments with the presence of clutter, some association decisions ($a(i,j) = 1$) are doubtful. For these unreliable associations, it is better to wait for new information (measurements) instead of applying the hard data association decision and making potentially serious association mistakes.

2. In some multisensor systems, it can also be important to save energy consumption to preserve a high autonomy capacity of the system. For this goal, only the most trustful specific associations provided in the optimal assignment have to be selected and used instead of all of them.

3. The best optimal assignment solution is not necessarily unique. In such a situation, the establishment of quality indicators may help in selecting one particular optimal assignment solution among multiple possible choices.


Before presenting our solution in Section 2, one must recall that the best, as well as the 2nd-best, optimal assignment solutions are unfortunately not necessarily unique. Therefore, we must also take into account the possible multiplicity of assignments in the analysis of the problem. The multiplicity index of the best optimal assignment solution is denoted $\beta_1 \geq 1$, and the multiplicity index of the 2nd-best optimal assignment solution is denoted $\beta_2 \geq 1$, and we will denote the sets of corresponding assignment matrices by $\mathcal{A}_1 = \{A_1^{k_1}, k_1 = 1, \ldots, \beta_1\}$ and by $\mathcal{A}_2 = \{A_2^{k_2}, k_2 = 1, \ldots, \beta_2\}$. The next simple example illustrates a case with multiplicity of 2nd-best assignment solutions for the reward matrix $\Omega_1$.


Example: $\beta_1 = 1$ and $\beta_2 = 4$ (i.e. no multiplicity of $A_1$ and multiplicity of $A_2$)

$$\Omega_1 = \begin{bmatrix} 1 & 11 & 45 & 30 \\ 17 & 8 & 38 & 27 \\ 10 & 14 & 35 & 20 \end{bmatrix}$$

This reward matrix provides a unique best assignment $A_1$ providing $R_1(\Omega_1, A_1) = 86$, and $\beta_2 = 4$ second-best assignment solutions providing $R_2(\Omega_1, A_2^{k_2}) = 82$ ($k_2 = 1, 2, 3, 4$) given by

$$A_1 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$

$$A_2^{k_2=1} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \quad A_2^{k_2=2} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad A_2^{k_2=3} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}, \quad A_2^{k_2=4} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$


2 Quality of the Associations of the Optimal Assignment


To establish the quality of the specific associations (pairings) $(i,j)$ satisfying $a_1(i,j) = 1$ belonging to the optimal assignment matrix $A_1$, we propose to use both $A_1$ and the 2nd-best assignment solution $A_2$. The basic idea is to compare the values $a_1(i,j)$ with $a_2(i,j)$ obtained in the best and in the 2nd-best assignments to identify the change (if any) of the optimal pairing $(i,j)$. Our quality indicator will depend on both the stability of the pairing and its relative impact in the global reward. The proposed method works also when the 2nd-best assignment solution $A_2$ is not unique (as in our example). The proposed method will also help to select the best (most trustful) optimal assignment in case of multiplicity of $A_1$ matrices.
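The best and 2nd-best assignments that this comparison relies on can be listed by brute force for a small reward matrix such as $\Omega_1$ of the example. The Python sketch below enumerates every feasible assignment; it is only an illustration (the function name `ranked_assignments` is hypothetical) and is not one of the efficient algorithms cited in the Introduction, since its cost grows factorially with the problem size.

```python
from itertools import permutations


def ranked_assignments(omega):
    """Group all feasible assignments of an m x n reward matrix (m <= n)
    by global reward, best reward first.

    An assignment is a tuple cols with cols[i] = index of the measurement
    given to target i (0-based).
    """
    m, n = len(omega), len(omega[0])
    by_reward = {}
    for cols in permutations(range(n), m):
        reward = sum(omega[i][j] for i, j in enumerate(cols))
        by_reward.setdefault(reward, []).append(cols)
    return sorted(by_reward.items(), reverse=True)


omega1 = [[1, 11, 45, 30],
          [17, 8, 38, 27],
          [10, 14, 35, 20]]

ranking = ranked_assignments(omega1)
(best_reward, best), (second_reward, second) = ranking[0], ranking[1]
print(best_reward, best)
print(second_reward, sorted(second))
```

The length of each group gives the multiplicity index directly: `len(best)` plays the role of $\beta_1$ and `len(second)` the role of $\beta_2$.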


2.1 A Simplistic Method (Method I) 


Before presenting our sophisticated method based on belief functions, let's first present a simplistic intuitive method (called Method I). For this, let's assume at first that $A_1$ and $A_2$ are unique (no multiplicity occurs). The simplistic method uses only the ratio of global rewards $\rho = R_2(\Omega, A_2)/R_1(\Omega, A_1)$ to measure the level of uncertainty in the change (if any) of pairing $(i,j)$ provided in $A_1$ and $A_2$. More precisely, the quality (trustfulness) of pairings in an optimal assignment solution $A_1$, denoted⁴ $q_I(i,j)$, is simply defined as follows for $i = 1, \ldots, m$ and $j = 1, \ldots, n$:

$$q_I(i,j) \triangleq \begin{cases} 1, & \text{if } a_1(i,j) + a_2(i,j) = 0 \\ 1 - \rho, & \text{if } a_1(i,j) + a_2(i,j) = 1 \\ 1, & \text{if } a_1(i,j) + a_2(i,j) = 2 \end{cases} \tag{3}$$


By adopting such a definition, one commits the full confidence to the components $(i,j)$ of $A_1$ and $A_2$ that perfectly match, and a lower confidence value (a lower quality) of $1 - \rho$ to those that do not match. To take into account the eventual multiplicities (when $\beta_2 > 1$) of the 2nd-best assignment solutions $A_2^{k_2}$, $k_2 = 1, 2, \ldots, \beta_2$, we need to combine the $Q_I(A_1, A_2^{k_2})$ values. Several methods can be used for this, in particular we can use either:


- A weighted averaging approach: The quality indicator component $q_I(i,j)$ is then obtained by averaging the qualities obtained from each comparison of $A_1$ with $A_2^{k_2}$. More precisely, one will take:

$$q_I(i,j) \triangleq \sum_{k_2=1}^{\beta_2} w(A_2^{k_2})\, q_I^{k_2}(i,j) \tag{4}$$

where $q_I^{k_2}(i,j)$ is defined as in (3) (with $a_2(i,j)$ replaced by $a_2^{k_2}(i,j)$ in the formula), and where $w(A_2^{k_2})$ is a weighting factor in $[0,1]$, such that $\sum_{k_2=1}^{\beta_2} w(A_2^{k_2}) = 1$. Since all assignments $A_2^{k_2}$ have the same global reward value $R_2$, we suggest taking $w(A_2^{k_2}) = 1/\beta_2$. A more elaborate method would consist of using the quality indicator of $A_2^{k_2}$ based on the 3rd-best solution, which can itself be computed from the quality of the 3rd assignment solution based on the 4th-best solution, and so on by a similar mechanism. We however don't give more details on this due to space constraints.

4 The subscript I in the $q_I(i,j)$ notation refers to Method I.


- A belief-based approach (see [22] for basics on belief functions): A second
  method would express the quality by a belief interval $[q_I^{\min}(i,j), q_I^{\max}(i,j)]$
  in $[0,1]$ instead of a single real number $q_I(i,j)$ in $[0,1]$. More precisely, one
  can compute the belief and plausibility bounds of the quality by taking
  $q_I^{\min}(i,j) \equiv Bel(a_1(i,j)) = \min_{k_2} q_I^{k_2}(i,j)$ and $q_I^{\max}(i,j) \equiv Pl(a_1(i,j)) = \max_{k_2} q_I^{k_2}(i,j)$,
  with $q_I^{k_2}(i,j)$ given by (3) and $a_2(i,j)$ replaced by $a_2^{k_2}(i,j)$
  in the formula. Hence for each association $a_1(i,j)$, one can define a basic
  belief assignment (BBA) $m_{ij}(\cdot)$ on the frame of discernment $\Theta \triangleq \{T = \text{trustful}, \neg T = \text{not trustful}\}$,
  which will characterize the quality of the pairing
  $(i,j)$ in the optimal assignment solution $A_1$, as follows:

  $$\begin{cases} m_{ij}(T) = q_I^{\min}(i,j) \\ m_{ij}(\neg T) = 1 - q_I^{\max}(i,j) \\ m_{ij}(T \cup \neg T) = q_I^{\max}(i,j) - q_I^{\min}(i,j) \end{cases} \tag{5}$$


Remark: In practice, only the pairings⁵ $(i,j)$ such that $a_1(i,j) = 1$ are useful
in tracking algorithms to update the tracks. Therefore, we do not need to
compute and store the qualities of components $(i,j)$ such that
$a_1(i,j) = 0$.
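Method I can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are ours, the small 2 x 3 assignment matrices are hypothetical toy data (they are not the matrices of the running example), and uniform weights $w = 1/\beta_2$ are used as suggested for (4).

```python
def method1_quality(a1, a2, rho):
    """Eq. (3): quality 1 where the 0/1 matrices agree, 1 - rho where they differ."""
    return [[1.0 if x == y else 1.0 - rho for x, y in zip(r1, r2)]
            for r1, r2 in zip(a1, a2)]


def average_quality(a1, second_best, rho):
    """Eq. (4) with uniform weights w = 1/beta2 over all 2nd-best solutions."""
    qs = [method1_quality(a1, a2, rho) for a2 in second_best]
    beta2 = len(qs)
    return [[sum(q[i][j] for q in qs) / beta2 for j in range(len(a1[0]))]
            for i in range(len(a1))]


def belief_interval(a1, second_best, rho):
    """Bounds used in (5): (min, max) of the per-solution qualities."""
    qs = [method1_quality(a1, a2, rho) for a2 in second_best]
    return [[(min(q[i][j] for q in qs), max(q[i][j] for q in qs))
             for j in range(len(a1[0]))] for i in range(len(a1))]


# Hypothetical toy data: one optimal assignment and two 2nd-best assignments.
A1 = [[1, 0, 0], [0, 1, 0]]
A2_CANDIDATES = [[[1, 0, 0], [0, 0, 1]], [[0, 0, 1], [0, 1, 0]]]
Q_AVG = average_quality(A1, A2_CANDIDATES, rho=0.9)
Q_INT = belief_interval(A1, A2_CANDIDATES, rho=0.9)
```

With these toy inputs, a pairing kept by one of the two 2nd-best solutions and dropped by the other gets an averaged quality halfway between 1 and $1 - \rho$, and a belief interval spanning both values.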

2.2 A More Sophisticated and Efficient Method (Method II)


The previous method can be easily applied in practice but it does not work very
well, because the quality indicator depends only on the $\rho$ factor. This means that
all mismatches between the best assignment $A_1$ and the 2nd-best assignment
solution $A_2$ have their quality impacted in the same manner (they are all taken
as $1 - \rho$). As a simple example, if we consider the rewards matrix $\Omega_1$ given in
our example, we will have $\rho = R_2(\Omega_1, A_2^{k_2})/R_1(\Omega_1, A_1) = 82/86 \approx 0.95$, and
Method I with the weighted averaging approach (using the same
$w(A_2^{k_2}) = 1/\beta_2 = 0.25$ for $k_2 = 1, 2, 3, 4$) gives the following quality indicator matrix:


$$Q_I(A_1, A_2) = \frac{1}{\beta_2} \sum_{k_2=1}^{\beta_2} Q_I(A_1, A_2^{k_2}) = \begin{bmatrix} 1.0000 & 1.0000 & 0.5233 & 0.5233 \\ 0.5233 & 1.0000 & 0.7616 & 0.2849 \\ 0.7616 & 0.2849 & 0.7616 & 0.7616 \end{bmatrix} \tag{6}$$


We observe that optimal pairings (2,4) and (3,2) get the same quality value
0.2849 with Method I (based on averaging), even though these pairings have
different impacts on the global reward value, which is abnormal. If we use
Method I with the belief interval measure based on (5), the situation is worse,
because the three optimal pairings (1,3), (2,4) and (3,2) will get exactly the same
belief interval $[0.0465, 1]$. To take better account of the
reward values of each specific association given in the best assignment $A_1$ and
in the 2nd-best assignment $A_2$, we propose to use the following construction of
quality indicators depending on the type of matching (called Method II):

⁵ Given in the optimal solution found, for example, with Murty's algorithm.


- When $a_1(i,j) = a_2^{k_2}(i,j) = 0$, one has full agreement on the "non-association"
  $(T_i, z_j)$ in $A_1$ and in $A_2^{k_2}$. This non-association has no impact on
  the global reward values $R_1(\Omega, A_1)$ and $R_2(\Omega, A_2^{k_2})$, and it will be useless.
  Therefore, we can set its quality arbitrarily to $q_{II}^{k_2}(i,j) = 1$.

- When $a_1(i,j) = a_2^{k_2}(i,j) = 1$, one has full agreement on the association
  $(T_i, z_j)$ in $A_1$ and in $A_2^{k_2}$, and this association has different impacts on
  the global reward values $R_1(\Omega, A_1)$ and $R_2(\Omega, A_2^{k_2})$. To qualify the quality
  of this matching association $(T_i, z_j)$, we define the two BBAs on $X \triangleq (T_i, z_j)$
  and $X \cup \neg X$ (the ignorance), for $s = 1, 2$:

  $$\begin{cases} m_s(X) = a_s(i,j) \cdot \omega(i,j) / R_s(\Omega, A_s) \\ m_s(X \cup \neg X) = 1 - m_s(X) \end{cases} \tag{7}$$

  Applying the conjunctive rule of fusion, we get

  $$\begin{cases} m(X) = m_1(X) m_2(X) + m_1(X) m_2(X \cup \neg X) + m_1(X \cup \neg X) m_2(X) \\ m(X \cup \neg X) = m_1(X \cup \neg X)\, m_2(X \cup \neg X) \end{cases} \tag{8}$$

  Applying the pignistic transformation⁶ [24], we finally get $BetP(X) = m(X) + \frac{1}{2} m(X \cup \neg X)$
  and $BetP(\neg X) = \frac{1}{2} m(X \cup \neg X)$. Therefore, we choose the
  quality indicator as $q_{II}^{k_2}(i,j) = BetP(X)$.

- When $a_1(i,j) = 1$ and $a_2^{k_2}(i,j) = 0$, one has a disagreement (conflict) between
  the association $(T_i, z_j)$ in $A_1$ and the association $(T_i, z_{j_2})$ in $A_2^{k_2}$, where $j_2$ is the
  measurement index such that $a_2(i, j_2) = 1$. To qualify the quality of this
  non-matching association $(T_i, z_j)$, we define the two following basic belief
  assignments (BBAs) of the propositions $X \triangleq (T_i, z_j)$ and $Y \triangleq (T_i, z_{j_2})$:

  $$\begin{cases} m_1(X) = a_1(i,j) \cdot \omega(i,j) / R_1(\Omega, A_1) \\ m_1(X \cup Y) = 1 - m_1(X) \end{cases} \qquad \begin{cases} m_2(Y) = a_2(i,j_2) \cdot \omega(i,j_2) / R_2(\Omega, A_2^{k_2}) \\ m_2(X \cup Y) = 1 - m_2(Y) \end{cases} \tag{9}$$

  Applying the conjunctive rule, we get $m(X \cap Y = \emptyset) = m_1(X) m_2(Y)$ and

  $$\begin{cases} m(X) = m_1(X)\, m_2(X \cup Y) \\ m(Y) = m_1(X \cup Y)\, m_2(Y) \\ m(X \cup Y) = m_1(X \cup Y)\, m_2(X \cup Y) \end{cases} \tag{10}$$

  Because we need to work with a normalized combined BBA, we can choose
  different rules of combination (Dempster-Shafer's, Dubois-Prade's, Yager's
  rule [23], etc.). In this work, we recommend the Proportional Conflict
  Redistribution rule no. 6 (PCR6), proposed originally in the DSmT framework [23],
  because it has proved very efficient in practice. So, we get with PCR6:

  $$\begin{cases} m(X) = m_1(X)\, m_2(X \cup Y) + m_1(X) \cdot \dfrac{m_1(X)\, m_2(Y)}{m_1(X) + m_2(Y)} \\ m(Y) = m_1(X \cup Y)\, m_2(Y) + m_2(Y) \cdot \dfrac{m_1(X)\, m_2(Y)}{m_1(X) + m_2(Y)} \\ m(X \cup Y) = m_1(X \cup Y)\, m_2(X \cup Y) \end{cases} \tag{11}$$

  Applying the pignistic transformation, we finally get $BetP(X) = m(X) + \frac{1}{2} m(X \cup Y)$
  and $BetP(Y) = m(Y) + \frac{1}{2} m(X \cup Y)$. Therefore, we choose the
  quality indicators as follows: $q_{II}^{k_2}(i,j) = BetP(X)$, and $q_{II}^{k_2}(i,j_2) = BetP(Y)$.

⁶ We have chosen BetP here for its simplicity and because it is widely known, but
DSmP could be used instead in the expectation of better performance [23].
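The two non-trivial cases of Method II can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are ours, the individual rewards 20 and 15 are hypothetical, and only the global rewards 86 and 82 are taken from the running example.

```python
def matching_quality(w, r1, r2):
    """Matching case: BBAs as in (7), conjunctive rule (8), then BetP(X)."""
    m1, m2 = w / r1, w / r2
    m_x = m1 * m2 + m1 * (1 - m2) + (1 - m1) * m2
    m_ign = (1 - m1) * (1 - m2)          # mass left on the ignorance
    return m_x + 0.5 * m_ign


def conflicting_quality(w_x, w_y, r1, r2):
    """Conflicting case: BBAs as in (9), PCR6 as in (11), then (BetP(X), BetP(Y))."""
    m1_x, m2_y = w_x / r1, w_y / r2
    m1_u, m2_u = 1 - m1_x, 1 - m2_y      # masses on X u Y
    conflict = m1_x * m2_y               # mass of the empty intersection
    total = m1_x + m2_y
    back_x = m1_x * conflict / total if total > 0 else 0.0
    back_y = m2_y * conflict / total if total > 0 else 0.0
    m_x = m1_x * m2_u + back_x
    m_y = m1_u * m2_y + back_y
    m_u = m1_u * m2_u
    return m_x + 0.5 * m_u, m_y + 0.5 * m_u


# Hypothetical rewards of the two competing pairings; R1 = 86 and R2 = 82.
Q_MATCH = matching_quality(20.0, 86.0, 82.0)
Q_X, Q_Y = conflicting_quality(20.0, 15.0, 86.0, 82.0)
```

Because PCR6 returns the whole conflicting mass to $X$ and $Y$, the combined BBA is normalized and the two pignistic values of the conflicting case sum to one.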


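The absolute quality factor $Q_{abs}(A_1, A_2^{k_2})$ of the optimal assignment given in
$A_1$, conditioned by $A_2^{k_2}$, for any $k_2 \in \{1, 2, \ldots, \beta_2\}$ is defined as

$$Q_{abs}(A_1, A_2^{k_2}) \triangleq \sum_{i=1}^{m} \sum_{j=1}^{n} a_1(i,j)\, q_{II}^{k_2}(i,j) \tag{12}$$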
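As a small sketch of (12), the absolute quality factor simply sums the quality indicators over the pairings selected in $A_1$. The function name and the 2 x 2 toy matrices below are hypothetical.

```python
def absolute_quality(a1, q):
    """Eq. (12): sum of q(i, j) over the entries where a1(i, j) = 1."""
    return sum(a * qij for row_a, row_q in zip(a1, q)
               for a, qij in zip(row_a, row_q))


# Hypothetical optimal assignment and quality matrix.
Q_ABS = absolute_quality([[1, 0], [0, 1]], [[0.8, 1.0], [1.0, 0.6]])
```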
Example (continued): If we apply Method II (using the PCR6 fusion rule) to
the rewards matrix $\Omega_1$, then we will get the following quality matrix (using
the weighted averaging approach)

$$Q_{II}(A_1, A_2) = \frac{1}{\beta_2} \sum_{k_2=1}^{\beta_2} Q_{II}(A_1, A_2^{k_2}) = \begin{bmatrix} 1.0000 & 1.0000 & 0.7440 & 0.7022 \\ 0.7200 & 1.0000 & 0.8972 & 0.5753 \\ 0.8695 & 0.4957 & 0.9119 & 0.8861 \end{bmatrix}$$


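with the absolute quality factors $Q_{abs}(A_1, A_2^{k_2=1}) \approx 1.66$, $Q_{abs}(A_1, A_2^{k_2=2}) \approx 1.91$,
$Q_{abs}(A_1, A_2^{k_2=3}) \approx 2.19$, $Q_{abs}(A_1, A_2^{k_2=4}) \approx 1.51$. Naturally, we get

$$Q_{abs}(A_1, A_2^{k_2=3}) > Q_{abs}(A_1, A_2^{k_2=2}) > Q_{abs}(A_1, A_2^{k_2=1}) > Q_{abs}(A_1, A_2^{k_2=4})$$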


because $A_1$ has more matching pairings with $A_2^{k_2=3}$ than with the other 2nd-best
assignments $A_2^{k_2}$ ($k_2 \neq 3$), and those pairings also have the strongest impacts on
the global reward value. One sees that the quality matrix $Q_{II}$ differentiates the
qualities of each pairing in the optimal assignment $A_1$ as expected (unlike
Method I). With Method I we obtain the same quality indicator
value 0.2849 for the specific associations (2,4) and (3,2), which seems intuitively
unreasonable because the specific rewards of these associations have different
impacts on the global reward. If Method II based on the belief
interval measure computed from (5) is preferred⁷, we will get respectively for the
three optimal pairings (1,3), (2,4) and (3,2) the three distinct belief intervals
[0.5956, 0.8924], [0.4113, 0.7699] and [0.3524, 0.6529]. These belief intervals show
that the ordering of quality of optimal pairings (based either on the lower bound
or on the upper bound of the belief interval) is consistent with the ordering of
quality of optimal pairings in $Q_{II}(A_1, A_2)$ computed with the averaging approach.
Method II thus provides a more effective and comprehensive solution for estimating the
quality of each specific association provided in the optimal assignment solution
$A_1$.

⁷ Only in case of multiplicity of second-best assignments.
3 Conclusion


In this paper we have proposed a method based on belief functions for estab- 
lishing the quality of pairings belonging to the optimal data association (or as- 
signment) solution provided by a chosen algorithm. Our method is independent 
of the choice of the algorithm used in finding the optimal assignment solution, 
and, in case of multiple optimal solutions, it provides also a way to select the 
best optimal assignment solution (the one having the highest absolute quality 
factor). The method developed in this paper is general in the sense that it can be 
applied to different types of association problems corresponding to different sets 
of constraints. This method can be extended to SD-assignment problems. The
application of this approach in a realistic multi-target tracking context is under
investigation and will, if possible, be reported in a forthcoming publication.
Comparison of Identity Fusion Algorithms Using
Estimations of Confusion Matrices


G. Golino 


A. Graziano 
A. Farina 
W. Mellano 
F. Ciaramaglia 


Abstract: The scope of this paper is to investigate the
performance of different identity declaration fusion algorithms in
terms of probability of correct classification, supposing that the
information used to combine the inferences from the different
classifiers is affected by measurement errors. In particular, this
information is assumed to be provided in the form of
confusion matrices. Six identity fusion algorithms from the literature
with different complexity have been included in the comparison:
heuristic methods such as voting and Borda count, Bayes' and
Dempster-Shafer's methods, and the Proportional Redistribution
Rule n°1 in the Dempster-Shafer framework.


Keywords: target classification, identity fusion, confusion matrix.


I. INTRODUCTION 


In a multi-sensor system the target classification performance
can be improved by suitably combining the inferences
generated by the autonomous classifiers of the single sensors
(identity declaration fusion [1]). For this purpose it is desirable
to use the available information about the classification
performance of the single sensors. The confusion matrix,
whose elements correspond to the likelihoods of the different
classes involved, is a compact and detailed way of
representing the classification performance, from which the
probability of correct classification (Pcc) and the probabilities
of the various misclassification errors can be derived.
In particular, the elements of the confusion matrix can be used
to maximize the a-posteriori Pcc according to Bayes' theory. In
this case, if the numerical values of the confusion matrix were
error-free, the performance of the identity fusion would be
optimal. In practice, however, these values are estimated and
affected by errors. Under these conditions, Bayes' rule does
not always produce the best results. In particular, in the presence of
strongly conflicting inferences and estimation errors, the
application of Bayes' rule may be ineffective. It can be better
to apply simpler combination rules, such as some heuristic methods,
that are more robust to errors.

Dempster-Shafer's theory has been presented as a
generalization of Bayes' theory in [2]. A recent work has
disputed this claim, limiting its correctness to the case of
uniform a-priori probabilities [3]. The problem of the
Dempster-Shafer rule (and of Bayes' rule) in the presence of
conflicting inferences has been pointed out by the well-known
Zadeh's paradox [4]. In this paper we investigate the performance of
different algorithms that use estimated confusion matrices
affected by errors to combine the inferences from the single
classifiers. In particular, the heuristic methods
based on voting and on ranking (Borda count) are compared
with the methods using Bayes' and Dempster-Shafer's rules.
Moreover, the effectiveness of redistributing the
conflicting masses while preserving the Dempster-Shafer
framework, as done by the Proportional Redistribution Rules (PCR), is
evaluated.


The paper is organized as follows:

• in section II the identity fusion algorithms considered
in this paper are briefly described;

• in section III, four simple but representative identity
fusion problems are introduced as study cases, and the
corresponding results using different mean values of
the estimation errors of the confusion matrices are
reported and commented on;

• section IV gives the conclusions.


II. IDENTITY FUSION ALGORITHMS 


The algorithms for the identity fusion considered in this paper 
are: 


• Majority Voting (MV),
• Weighted Voting (WV),
• Borda count,
• Bayes' rule,
• Dempster-Shafer's (D-S) rule with the following basic
belief mass assignment: "q-least commitment",
• Proportional Redistribution Rule n°1 (PCR1) with the
following basic belief mass assignment: "q-least
commitment".


A brief description of the fusion algorithms follows. 


Majority voting [5] is the simplest method for the
combination of inferences: each inferred class corresponds to a
single vote and the class selected after fusion is the one with the most
votes, so all the inferences carry the same weight. In the modified
weighted version, the different votes are weighted by the
estimated Pcc of the voter/classifier.


Voting methods use only the top choice of each classifier,
but secondary choices often contain near misses that should not
be overlooked. The Borda count [5] is a method in which
classes are ranked in order of preference; it gives each class a
certain number of points corresponding to the position in which
it is ranked by each classifier. The class with the highest
score is then selected after fusion.
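The three heuristic rules can be sketched as follows. The function names are ours, and two details are assumptions not fixed by the text: ties are broken in favour of the class encountered first, and the Borda count gives N-1 points to the top-ranked class down to 0 for the last one.

```python
from collections import Counter


def majority_vote(declared):
    """Each classifier casts one vote for its declared class."""
    return Counter(declared).most_common(1)[0][0]


def weighted_vote(declared, pcc):
    """Each vote is weighted by the estimated Pcc of the classifier."""
    scores = {}
    for cls, weight in zip(declared, pcc):
        scores[cls] = scores.get(cls, 0.0) + weight
    return max(scores, key=scores.get)


def borda_count(rankings):
    """rankings: one list per classifier, classes ordered best first."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, cls in enumerate(ranking):
            scores[cls] = scores.get(cls, 0) + (n - 1 - position)
    return max(scores, key=scores.get)


# Hypothetical inferences of three classifiers on a three-class problem.
MV = majority_vote([1, 2, 2])
WV = weighted_vote([1, 2, 2], [0.9, 0.4, 0.4])
BC = borda_count([[1, 2, 3], [2, 1, 3], [2, 3, 1]])
```

In this toy case majority voting and the Borda count select class 2, while weighted voting selects class 1 because the single dissenting classifier has a much higher estimated Pcc.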


Bayes’ theorem [1,4] links the degree of belief in a 
proposition before and after accounting for evidence (a-priori 
and a-posteriori probabilities). The a-posteriori probability of a 
combination of two or more evidences is obtained by the 
multiplication of the likelihoods of the single evidences (the 
independence of the evidences is assumed). 


Dempster-Shafer theory [1, 4 and 5] allows one to specify a
degree of ignorance instead of being forced to supply
probabilities that add to unity. In this formalism a degree of
belief (also referred to as a Basic Belief Mass, BBM) is used
rather than a Bayesian probability distribution. BBM values are
assigned to sets of possibilities (unions of one or more classes)
rather than to a single class; probability is instead represented
by intervals that are lower-bounded by the value "belief" (or
"support") and upper-bounded by the value "plausibility".
BBM values from different sources can be combined with
Dempster-Shafer's rule of combination, assuming independent
belief sources. There is more than one possible assignment for
transforming probabilities into BBMs [7, 8 and 9]. The "q-least
commitment" basic belief mass assignment (which corresponds
to the maximum compatible degree of ignorance) has been
considered in this paper to transform the CM (Confusion
Matrix) likelihoods and the a-priori probabilities into BBM
values.


Proportional Redistribution Rules (PCR) are a family of
fusion rules for the combination of uncertain information
that can deal with highly conflicting sources. The PCR rules
can be used as alternatives to Dempster-Shafer's
combination rule. Six PCR rules (PCR1-PCR6) have been
defined [10, 11 and 12]: from PCR1 up to PCR6 the complexity
of the rules increases on the one hand, but on the other hand
the accuracy of the redistribution of conflicting
masses improves. The basic common principle of the PCR rules is to
redistribute the conflicting mass proportionally to some
functions depending on the sum of the masses assigned by the
single inferences. PCR1 is the least accurate combination rule
of the PCR family, but it is the simplest to implement and it has
been considered in this paper. PCR2-6 implementations are
significantly more complex because the conflicting mass is
redistributed only to the non-empty sets that are involved in the
conflict (extra computer memory is needed to keep track of the
conflicting hypotheses and extra computation load is needed
for combining them). A particularly interesting point for
further investigation would be testing the most efficient PCR
rule (PCR6) [12].


A. Combination rules 


In this section, the rules of Bayes, Dempster-Shafer and
PCR1 for the combination of two classifiers are briefly
recalled. For further details and the generalization of the rules
to more than two classifiers, see references [1], [4], [10] and
[11]. Voting and Borda count combinations are not considered
here because they consist simply of the sums of, respectively,
the votes and the ranks.


Let us consider a set $\Omega$ of possible exhaustive and mutually
exclusive classes $C_k$, with $N$ being the cardinality of this set:

$$\Omega = \{C_1, C_2, \ldots, C_N\} \tag{1}$$

Let us suppose that the independent classifiers 1 and 2 infer
respectively the classes $C_i$ and $C_j$; the a-posteriori probability
$P_B(C_k / C_i \cap C_j)$ of the class $C_k$ resulting from Bayes'
rule of combination is:

$$P_B(C_k / C_i \cap C_j) = \frac{P_1(C_i/C_k)\, P_2(C_j/C_k)\, P_0(C_k)}{\sum_{h=1}^{N} P_1(C_i/C_h)\, P_2(C_j/C_h)\, P_0(C_h)} \tag{2}$$

where:

• $P_0(\cdot)$ is the a-priori probability (without any
information obtained by previous classifications) of
the considered class;

• $P_1(C_i/C_k)$, $P_2(C_j/C_k)$ are the probabilities that
classifiers 1 and 2 infer the classes $C_i$ and $C_j$ respectively, assuming that
the true class is $C_k$ (likelihoods).
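Equation (2) extends directly to any number of independent classifiers. The sketch below assumes, for illustration only, that each classifier is described by a matrix whose entry `[k][h]` is the likelihood of declaring class k when the true class is h; the function name and the two-class numbers are hypothetical.

```python
def bayes_fusion(declared, matrices, priors):
    """Posterior over the true class given the declared classes (eq. (2)).

    matrices[s][k][h] = P(classifier s declares k | true class is h).
    """
    joint = []
    for h, prior in enumerate(priors):
        p = prior
        for s, k in enumerate(declared):
            p *= matrices[s][k][h]
        joint.append(p)
    total = sum(joint)
    if total == 0.0:
        raise ZeroDivisionError("every class has zero joint likelihood")
    return [p / total for p in joint]


# Hypothetical two-class example: both classifiers declare class 0.
M = [[0.8, 0.3], [0.2, 0.7]]
POSTERIOR = bayes_fusion([0, 0], [M, M], [0.5, 0.5])
```

The explicit check on the denominator matters for the conflicting study cases: when the estimated likelihoods contain zeros and the classifiers disagree, every term of the sum can vanish.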

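Let us consider the power set $2^\Omega$ of $\Omega$ as the set whose
elements are all the possible subsets of $\Omega$:

$$2^\Omega = \{F_1, F_2, \ldots, F_{2^N}\} = \{\emptyset, C_1, C_2, \ldots, C_N, C_1 \cup C_2, \ldots, \Omega\} \tag{3}$$

where $\emptyset$ is the empty set. The cardinality of $2^\Omega$ is $2^N$.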


Let us suppose that the independent classifiers 1 and 2 assign
respectively BBMs $m_1(\cdot)$ and $m_2(\cdot)$ to the elements included
in the power set $2^\Omega$; the combination BBM $m_{DS}(F_k)$ of $F_k$
resulting from Dempster-Shafer's rule of combination is:

$$m_{DS}(F_k) = \frac{\sum_{i,j \,/\, F_k = F_i \cap F_j} m_1(F_i)\, m_2(F_j)}{1 - m_c} \tag{4}$$

where $m_c$ is the global conflicting mass, defined as follows:

$$m_c = \sum_{p,q \,/\, F_p \cap F_q = \emptyset} m_1(F_p)\, m_2(F_q) \tag{5}$$

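In the case of the PCR1 rule the combination BBM
$m_{PCR1}(F_k)$ of $F_k$ is instead:

$$m_{PCR1}(F_k) = \sum_{i,j \,/\, F_i \cap F_j = F_k \neq \emptyset} m_1(F_i)\, m_2(F_j) + \frac{m_1(F_k) + m_2(F_k)}{\sum_{h=1}^{2^N} \left( m_1(F_h) + m_2(F_h) \right)} \cdot m_c \tag{6}$$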
III. STUDY CASES


Four simple but representative study cases (three different 
classifiers for a three classes problem) have been investigated, 
as follows: 


• complementary confusion matrices,

• supplementary confusion matrices,

• complementary conflicting confusion matrices,

• supplementary conflicting confusion matrices.


By complementary CMs we mean that the single
classifiers show complementary expertise in the recognition
of the different classes. By supplementary CMs we mean that the different
classifiers show similar behaviors. By conflicting CMs we mean that a
possible overestimation of the performance of the single
classifiers can make it harder to combine effectively the
contradictory inferences from different classifiers when they
occur. A quantitative definition of complementary and
supplementary CMs can be found in [13].


In the following sub-sections, the estimated confusion
matrices that have been selected for the four study cases are
reported. The columns of the matrices represent the true
classes, while the rows correspond to the inferred classes, so
the element $(k,h)$ of a matrix is an estimate of the probability
of declaring the $k$-th class when the true class is the $h$-th one:

$$M_i(k,h) = P(D = k / T = h) \tag{7}$$

The a-priori probabilities of the different classes are
assumed equal. A block diagram of the fusion system is shown
in fig. 1.


The performances of the six algorithms in correspondence 
of the identity fusion of six inferences (two independent 
inferences for each classifier) have been considered. The 
performances of the different algorithms have been computed 
with 1000 Monte Carlo trials, each generating independent 
samples of the true confusion matrices and a-prior 
probabilities. 


The results of the Monte Carlo trials are represented by the 
curves corresponding to Empirical Cumulative Distribution 
Function (ECDF) versus the Pcc. The x-axis values of Pcc have 
been computed exactly, that is the contribution of all the 
possible permutations of the single sensor inferences has been 
considered. 
A. Complementary confusion matrices


The following confusion matrices corresponding to three 
different classifiers have been considered: 


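$$M_1 = \begin{bmatrix} 0.90 & 0.30 & 0.30 \\ 0.10 & 0.40 & 0.30 \\ 0 & 0.30 & 0.40 \end{bmatrix}, \quad M_2 = \begin{bmatrix} 0.40 & 0.30 & 0.10 \\ 0.30 & 0.40 & 0 \\ 0.30 & 0.30 & 0.90 \end{bmatrix}, \quad M_3 = \begin{bmatrix} 0.40 & 0 & 0.30 \\ 0.30 & 0.90 & 0.30 \\ 0.30 & 0.10 & 0.40 \end{bmatrix} \tag{8}$$
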

The performance is dependent on the true target class (class 
1, 2 or 3): 


• the first classifier correctly identifies targets
belonging to class 1 (on average it makes only one
mistake in ten of its inferences), while its inferences are almost
random for targets belonging to class 2 or class 3,

• the second classifier correctly identifies targets
belonging to class 3 (on average it makes only one
mistake in ten of its inferences), while its inferences are almost
random for targets belonging to class 1 or class 2,

• the third classifier correctly identifies targets
belonging to class 2 (on average it makes only one
mistake in ten of its inferences), while its inferences are almost
random for targets belonging to class 1 or class 3.
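The structure of the matrices in (8) can be checked with a short sketch. The matrices are transcribed from (8) with rows as declared classes and columns as true classes; the helper names are ours, and equal a-priori probabilities are used as stated in the text.

```python
M1 = [[0.90, 0.30, 0.30], [0.10, 0.40, 0.30], [0.00, 0.30, 0.40]]
M2 = [[0.40, 0.30, 0.10], [0.30, 0.40, 0.00], [0.30, 0.30, 0.90]]
M3 = [[0.40, 0.00, 0.30], [0.30, 0.90, 0.30], [0.30, 0.10, 0.40]]


def column_sums(matrix):
    """Each column is a distribution over declared classes, so it sums to 1."""
    return [sum(row[h] for row in matrix) for h in range(len(matrix[0]))]


def mean_pcc(matrix, priors):
    """Single-classifier Pcc: prior-weighted average of the diagonal."""
    return sum(p * matrix[h][h] for h, p in enumerate(priors))


PRIORS = [1 / 3, 1 / 3, 1 / 3]
PCC = [mean_pcc(m, PRIORS) for m in (M1, M2, M3)]
```

All three classifiers have the same nominal single-classifier Pcc, but each is accurate on a different class, which is what makes the set complementary.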


The performances of the six different algorithms are
reported in fig. 2 and 3 for an estimation
of the confusion matrix using respectively 30 and 10
independent samples for each class. The performances of the
single classifiers correspond to the dotted curves (indicated as
C1, C2 and C3 in the legends of the figures). Bayes' rule,
Dempster-Shafer's rule, Borda count and PCR1 give similar
results, with PCR1 marginally the best. The voting
algorithms show significantly worse performance.
B. Supplementary confusion matrices


The following confusion matrices corresponding to three 
different classifiers have been considered: 


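$$M_1 = \begin{bmatrix} 0.70 & 0.10 & 0.20 \\ 0.20 & 0.70 & 0.10 \\ 0.10 & 0.20 & 0.70 \end{bmatrix}, \quad M_2 = \begin{bmatrix} 0.80 & 0.10 & 0.10 \\ 0.10 & 0.80 & 0.10 \\ 0.10 & 0.10 & 0.80 \end{bmatrix}, \quad M_3 = \begin{bmatrix} 0.60 & 0.20 & 0.20 \\ 0.20 & 0.60 & 0.20 \\ 0.20 & 0.20 & 0.60 \end{bmatrix} \tag{9}$$
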

In the second example, three classifiers with supplementary 
confusion matrices have been selected. A single classifier can 
recognize all the three classes with the same accuracy, but the 
accuracy differs from classifier to classifier: 


• the first classifier has an estimated probability of
correct classification equal to 70%,

• the second classifier has an estimated probability
of correct classification equal to 80%,

• the third classifier has an estimated probability of
correct classification equal to 60%.


The performances of the six different algorithms are
reported in fig. 4 and 5 for an estimation
of the confusion matrix using respectively 30 and 10
independent samples for each class. The performances of the
single classifiers correspond to the dotted curves (indicated as
C1, C2 and C3 in the legends of the figures). All the algorithms
give comparable performance. PCR1 and Bayes' rule
perform exactly the same and slightly better
than the others; weighted voting performs better than Borda
count and majority voting.


C. Complementary conflicting confusion matrices 


The following confusion matrices corresponding to three 
different classifiers have been considered: 


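$$M_1 = \begin{bmatrix} 1.00 & 0.00 & 0.00 \\ 0.00 & 0.60 & 0.40 \\ 0.00 & 0.40 & 0.60 \end{bmatrix}, \quad M_2 = \begin{bmatrix} 0.60 & 0.00 & 0.40 \\ 0.00 & 1.00 & 0.00 \\ 0.40 & 0.00 & 0.60 \end{bmatrix}, \quad M_3 = \begin{bmatrix} 0.60 & 0.40 & 0.00 \\ 0.40 & 0.60 & 0.00 \\ 0.00 & 0.00 & 1.00 \end{bmatrix} \tag{10}$$
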
In this third example, the three classifiers can be affected by
conflicting inferences. As a consequence, the application of
Bayes' rule to fusion leads to severe performance degradation
with respect to heuristic methods. The problem arises from an
overestimation of the performance of the single classifiers.


The performances of the six different algorithms are
reported in fig. 6 and 7 for an estimation
of the confusion matrix using respectively 30 and 10
independent samples for each class. The performances of the
single classifiers correspond to the dotted curves (indicated as
C1, C2 and C3 in the legends of the figures). PCR1 gives the
best result, slightly better than Borda count. Majority and
weighted voting have coincident performances that are
significantly worse than that of PCR1. The performances of
Bayes' rule and the Dempster-Shafer rule coincide exactly
and are worse than all the others because of the presence of
conflicting inferences.


D. Supplementary conflicting confusion matrices 


The following confusion matrices corresponding to three 
different classifiers have been considered: 


$$M_1 = M_2 = M_3 = \begin{bmatrix} 1.00 & 0.00 & 0.00 \\ 0.00 & 1.00 & 0.00 \\ 0.00 & 0.00 & 1.00 \end{bmatrix} \tag{11}$$


In the fourth example, three classifiers with identity
confusion matrices as estimates have been selected:
according to these estimates a single classifier is never
wrong. If the classifiers disagree on the inferred class,
Bayes' rule of fusion leads to severe performance degradation
with respect to heuristic methods.
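The difficulty can be seen directly from (2): with the identity matrices of (11), any disagreement among the classifiers makes every joint likelihood vanish, so the denominator of Bayes' rule is zero. The sketch below only illustrates this arithmetic; the helper name is ours, and how the simulations behind the reported figures treat this situation is not specified here.

```python
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def joint_likelihoods(declared, matrices, priors):
    """Numerators of (2): prior times the product of the likelihoods."""
    joint = []
    for h, prior in enumerate(priors):
        p = prior
        for s, k in enumerate(declared):
            p *= matrices[s][k][h]   # P(classifier s declares k | true class h)
        joint.append(p)
    return joint


PRIORS = [1 / 3, 1 / 3, 1 / 3]
# Classifiers 1 and 3 declare class index 0, classifier 2 declares class index 1.
DISAGREE = joint_likelihoods([0, 1, 0], [IDENTITY] * 3, PRIORS)
AGREE = joint_likelihoods([0, 0, 0], [IDENTITY] * 3, PRIORS)
```

When all classifiers agree only the declared class keeps a non-zero numerator, whereas a single dissenting inference drives all three numerators to zero. Majority voting, in contrast, would still return class index 0 in the disagreeing case.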


The performances of the six different algorithms are
reported in fig. 8 and 9 for an estimation
of the confusion matrix using respectively 30 and 10
independent samples for each class. The performances of the
single classifiers correspond to the dotted curves (indicated as
C1, C2 and C3 in the legends of the figures). The performances
of the voting algorithms, Borda count and PCR1 coincide
exactly and are close to 100%. The performances of Bayes' rule
and the Dempster-Shafer rule coincide exactly and are much
worse than all the others because of the presence of conflicting
inferences; they are even much worse than the performance of the
single classifiers.


E. Summary results 


In table I the average (over the 1000 Monte Carlo trials)
Pcc is reported for all the investigated study cases. An
intermediate case, where 60 samples (20
samples per class) are used for the estimation of each confusion matrix,
is also reported. It can be noted that PCR1 always yields
the highest Pcc of all the six considered combination rules.
IV. CONCLUSIONS


Simulation results show that in the considered study cases
the algorithm using the PCR1 rule of combination gives the
best performance of all the six considered alternatives and
clearly outperforms Bayes' and D-S's rules in the cases where
the probability of conflicts between the inferences is high. This
performance difference increases as the
number of samples used for the estimation of the confusion
matrices decreases. This behavior is a consequence of the poor
performance of the latter two combination methods in the presence
of conflicting inferences from the different classifiers, as
illustrated by Zadeh's paradox. In two investigated study
cases Bayes' and Dempster-Shafer rules perform even worse than heuristic
approaches.


In the cases where conflict is less likely, the
performance of PCR1 is comparable with that of
Bayes' and D-S's rules (the same or slightly better).


The implementation of PCR1 slightly increases the
computational complexity with respect to D-S's rule. Future work may
address the comparison of the performance resulting from
the application of more complex PCR rules to the inferences of
classifiers whose accuracies are represented by conflicting
confusion matrices.
Fig. 4. ECDF versus Pcc with supplementary confusion matrices
(estimation from 30 samples per class).
TABLE I. SUMMARY RESULTS (N IS THE TOTAL NUMBER OF SAMPLES).

PROBABILITY OF CORRECT CLASSIFICATION (mean value, %)

Study cases | MAJORITY VOTING | WEIGHTED VOTING | BORDA COUNT | BAYES | DEMPSTER-SHAFER | PCR1
 | N=30 | N=60 | N=90 | N=30 | N=60 | N=90 | N=30 | N=60 | N=90 | N=30 | N=60 | N=90 | N=30 | N=60 | N=90 | N=30 | N=60 | N=90
Compl. CMs | 67.8 | 72.4 | 73.9 | 67.8 | 72.4 | 73.9 | 73.9 | 78.9 | 80.7 | 73.1 | 79.3 | 81.6 | 73.0 | 79.4 | 81.6 | 74.8 | 80.5 | 82.4
Supp. CMs | 82.3 88.4 | 89.9 | 82.7 | 87.4 | 88.8 | 84.3 90.4 | 84.0 | 88.7 90.4
Compl. confl. CMs | 87.7 | 92.9 | 94.8 | 87.7 | 92.9 | 94.8 | 94.5 | 98.1 | 99.0 | 70.7 | 81.5 | 86.5 | 70.7 | 81.5 | 86.5 : : 99.2
Supp. confl. CMs | 98.8 | 99.8 98.8 | 99.8 | 99.9 | 98.8 | 99.8 | 99.9 | 62.5 | 75.5 | 81.8 | 62.5 | 75.5 | 81.8 | 98.8 | 99.8 | 99.9





Advances and Applications of DSmT for Information Fusion. Collected Works. Volume 4 


Reliability and Importance 
Discounting of Neutrosophic Masses 


Florentin Smarandache 


Originally published in Smarandache, F. - Neutrosophic Theory and its 
Applications, Collected Papers, Vol. I. 2014. <hal-01092887v2>, and 
reprinted with permission. 


Abstract. In this paper, we introduce for the first time the discounting of a
neutrosophic mass in terms of the reliability and, respectively, the importance of
the source.
We show that reliability and importance discounts commute when
dealing with classical masses.


1. Introduction. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ be the frame of discernment,
where $n \geq 2$, and the set of focal elements:
$F = \{A_1, A_2, \ldots, A_m\}$, for $m \geq 1$, $F \subset G^\Theta$. (1)
Let $G^\Theta = (\Theta, \cup, \cap, \mathcal{C})$ be the fusion space.
A neutrosophic mass is defined as follows:
$m_n : G^\Theta \to [0,1]^3$
for any $x \in G^\Theta$, $m_n(x) = (t(x), i(x), f(x))$, (2)
where $t(x)$ = belief that $x$ will occur (truth);
$i(x)$ = indeterminacy about occurrence;
and $f(x)$ = belief that $x$ will not occur (falsity).
Simply, we say in neutrosophic logic:
$t(x)$ = belief in $x$;
$i(x)$ = belief in neut($x$)
[the neutral of $x$, i.e. neither $x$ nor anti($x$)];
and $f(x)$ = belief in anti($x$) [the opposite of $x$].
Of course, $t(x), i(x), f(x) \in [0,1]$, and
$\sum_{x \in G^\Theta} [t(x) + i(x) + f(x)] = 1$, (3)
while
$m_n(\emptyset) = (0, 0, 0)$. (4)


It is possible that according to some parameters (or data) a source is
able to predict the belief in a hypothesis $x$ occurring, while according to other
parameters (or other data) the same source may be able to find the belief
in $x$ not occurring, and upon a third category of parameters (or data) the
source may find some indeterminacy (ambiguity) about the hypothesis
occurrence.
An element $x \in G^\Theta$ is called focal if
$m_n(x) \neq (0, 0, 0)$, (5)
i.e. $t(x) > 0$ or $i(x) > 0$ or $f(x) > 0$.
Any classical mass:
$m : G^\Theta \to [0,1]$ (6)
can be simply written as a neutrosophic mass as:
$m(A) = (m(A), 0, 0)$. (7)




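2. Discounting a Neutrosophic Mass due to Reliability of the
Source.


Let $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ be the reliability coefficient of the source, $\alpha \in [0,1]^3$.


Then, for any $x \in G^\Theta \setminus \{\emptyset, I_t\}$,
where $\emptyset$ = the empty set
and $I_t$ = total ignorance,
$m_n(x)_\alpha = (\alpha_1 t(x), \alpha_2 i(x), \alpha_3 f(x))$, (8)

and

$$m_n(I_t)_\alpha = \Big( t(I_t) + (1 - \alpha_1) \sum_{x \in G^\Theta \setminus \{\emptyset, I_t\}} t(x),\; i(I_t) + (1 - \alpha_2) \sum_{x \in G^\Theta \setminus \{\emptyset, I_t\}} i(x),\; f(I_t) + (1 - \alpha_3) \sum_{x \in G^\Theta \setminus \{\emptyset, I_t\}} f(x) \Big) \tag{9}$$

and, of course,

$m_n(\emptyset)_\alpha = (0, 0, 0)$.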


The missing mass of each element $x$, for $x \neq \emptyset$, $x \neq I_t$, is transferred to
the mass of the total ignorance in the following way:
$t(x) - \alpha_1 t(x) = (1 - \alpha_1) \cdot t(x)$ is transferred to $t(I_t)$, (10)
$i(x) - \alpha_2 i(x) = (1 - \alpha_2) \cdot i(x)$ is transferred to $i(I_t)$, (11)
and $f(x) - \alpha_3 f(x) = (1 - \alpha_3) \cdot f(x)$ is transferred to $f(I_t)$. (12)
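As a hedged illustration of formulas (8) to (12), the sketch below scales each component of every element other than the total ignorance and moves the removed amounts to the ignorance. The dictionary representation, the label "A|B" for the total ignorance and the numeric values are assumptions made for the example only.

```python
from math import isclose


def reliability_discount(mass, alpha, ignorance):
    """Apply (8) and (9): scale each (t, i, f) by alpha and move the rest to `ignorance`."""
    out = {}
    moved = [0.0, 0.0, 0.0]  # missing mass per component, as in (10)-(12)
    for x, triple in mass.items():
        if x == ignorance:
            continue
        out[x] = tuple(alpha[k] * triple[k] for k in range(3))
        for k in range(3):
            moved[k] += (1.0 - alpha[k]) * triple[k]
    base = mass.get(ignorance, (0.0, 0.0, 0.0))
    out[ignorance] = tuple(base[k] + moved[k] for k in range(3))
    return out


# Hypothetical neutrosophic mass whose components sum to 1.
mass = {"A": (0.4, 0.1, 0.1), "B": (0.2, 0.1, 0.0), "A|B": (0.1, 0.0, 0.0)}
discounted = reliability_discount(mass, alpha=(0.8, 0.5, 0.9), ignorance="A|B")
total = sum(c for triple in discounted.values() for c in triple)
print(discounted, total)
```

Because every removed amount is added back on the ignorance, the total mass is preserved by construction.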
3. Discounting a Neutrosophic Mass due to the Importance of the Source.

Let $\beta \in [0, 1]$ be the importance coefficient of the source. This discounting
can be done in several ways.

a. For any $x \in G^\Theta \setminus \{\emptyset\}$,

$m_n(x)_{\beta_1} = (\beta \cdot t(x),\; i(x),\; f(x) + (1 - \beta) \cdot t(x))$, (13)

which means that t(x), the belief in x, is diminished to $\beta \cdot t(x)$, and the
missing mass, $t(x) - \beta \cdot t(x) = (1 - \beta) \cdot t(x)$, is transferred to the belief in
anti(x).

b. Another way:

For any $x \in G^\Theta \setminus \{\emptyset\}$,

$m_n(x)_{\beta_2} = (\beta \cdot t(x),\; i(x) + (1 - \beta) \cdot t(x),\; f(x))$, (14)

which means that t(x), the belief in x, is similarly diminished to $\beta \cdot t(x)$,
and the missing mass $(1 - \beta) \cdot t(x)$ is now transferred to the belief in
neut(x).

c. The third way is the most general, putting together the first and second
ways.
For any $x \in G^\Theta \setminus \{\emptyset\}$,

$m_n(x)_{\beta_3} = (\beta \cdot t(x),\; i(x) + (1 - \beta) \cdot t(x) \cdot \gamma,\; f(x) + (1 - \beta) \cdot t(x) \cdot (1 - \gamma))$, (15)

where $\gamma \in [0, 1]$ is a parameter that splits the missing mass $(1 - \beta) \cdot t(x)$:
the part $\gamma$ goes to i(x) and the other part $1 - \gamma$ goes to f(x).

For $\gamma = 0$, one gets the first way of distribution, and when $\gamma = 1$, one
gets the second way of distribution.
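The third way, formula (15), can be sketched as follows. The dictionary representation and the sample values are assumptions for illustration; setting gamma to 0 or 1 reproduces the first and second ways, formulas (13) and (14).

```python
from math import isclose


def importance_discount(mass, beta, gamma):
    """Formula (15): keep beta*t(x) and split the missing (1-beta)*t(x) between i(x) and f(x)."""
    out = {}
    for x, (t, i, f) in mass.items():
        missing = (1.0 - beta) * t
        out[x] = (beta * t, i + gamma * missing, f + (1.0 - gamma) * missing)
    return out


# Hypothetical single-element mass, used only to exercise the three ways.
sample = {"A": (0.4, 0.0, 0.0)}
third_way = importance_discount(sample, beta=0.7, gamma=0.4)["A"]
first_way = importance_discount(sample, beta=0.7, gamma=0.0)["A"]   # all to f(x), as in (13)
second_way = importance_discount(sample, beta=0.7, gamma=1.0)["A"]  # all to i(x), as in (14)
print(third_way, first_way, second_way)
```

Note that, unlike the reliability discount, this operation keeps the missing mass on the same element x, so the sum t(x) + i(x) + f(x) is unchanged for each x.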
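4. Discounting of Reliability and Importance of Sources in General
Do Not Commute.

a. Reliability first, Importance second.
For any $x \in G^\Theta \setminus \{\emptyset, I_t\}$, one has after reliability $\alpha$ discounting, where
$\alpha = (\alpha_1, \alpha_2, \alpha_3)$:
$m_n(x)_\alpha = (\alpha_1 \cdot t(x),\; \alpha_2 \cdot i(x),\; \alpha_3 \cdot f(x))$, (16)

and $m_n(I_t)_\alpha = \Big( t(I_t) + (1-\alpha_1) \sum_{x \in G^\Theta \setminus \{\emptyset, I_t\}} t(x),\; i(I_t) + (1-\alpha_2) \sum_{x \in G^\Theta \setminus \{\emptyset, I_t\}} i(x),\; f(I_t) + (1-\alpha_3) \sum_{x \in G^\Theta \setminus \{\emptyset, I_t\}} f(x) \Big) \overset{\mathrm{def}}{=} (T_{I_t}, I_{I_t}, F_{I_t})$. (17)

Now we do the importance $\beta$ discounting method, the third importance
discounting way which is the most general:
$m_n(x)_{\alpha\beta_3} = (\beta \alpha_1 t(x),\; \alpha_2 i(x) + (1 - \beta) \alpha_1 t(x) \gamma,\; \alpha_3 f(x) + (1 - \beta) \alpha_1 t(x) (1 - \gamma))$ (18)

and

$m_n(I_t)_{\alpha\beta_3} = (\beta \cdot T_{I_t},\; I_{I_t} + (1 - \beta) T_{I_t} \cdot \gamma,\; F_{I_t} + (1 - \beta) T_{I_t} (1 - \gamma))$. (19)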
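b. Importance first, Reliability second.

For any $x \in G^\Theta \setminus \{\emptyset, I_t\}$, one has after importance $\beta$ discounting (third way):
$m_n(x)_{\beta_3} = (\beta \cdot t(x),\; i(x) + (1 - \beta) t(x) \gamma,\; f(x) + (1 - \beta) t(x) (1 - \gamma))$ (20)
and
$m_n(I_t)_{\beta_3} = (\beta \cdot t(I_t),\; i(I_t) + (1 - \beta) t(I_t) \gamma,\; f(I_t) + (1 - \beta) t(I_t) (1 - \gamma))$. (21)
Now we do the reliability $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ discounting, and one gets:
$m_n(x)_{\beta_3\alpha} = (\alpha_1 \cdot \beta \cdot t(x),\; \alpha_2 \cdot i(x) + \alpha_2 (1 - \beta) t(x) \gamma,\; \alpha_3 \cdot f(x) + \alpha_3 (1 - \beta) t(x) (1 - \gamma))$ (22)

and
$m_n(I_t)_{\beta_3\alpha} = (\alpha_1 \cdot \beta \cdot t(I_t),\; \alpha_2 \cdot i(I_t) + \alpha_2 (1 - \beta) t(I_t) \gamma,\; \alpha_3 \cdot f(I_t) + \alpha_3 (1 - \beta) t(I_t) (1 - \gamma))$. (23)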


Remark. 


We see that (a) and (b) are in general different, so reliability of sources 


does not commute with the importance of sources. 
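To make the remark concrete, one can subtract formula (22) from formula (18) component by component for an element x other than the empty set and the total ignorance. The following worked comparison uses only the symbols already introduced in those two formulas:

```latex
\begin{align*}
m_n(x)_{\alpha\beta_3} - m_n(x)_{\beta_3\alpha}
  = \big(\, 0,\;
      (\alpha_1 - \alpha_2)(1-\beta)\,\gamma\, t(x),\;
      (\alpha_1 - \alpha_3)(1-\beta)(1-\gamma)\, t(x) \,\big).
\end{align*}
```

On such x the two orders therefore agree when $\alpha_1 = \alpha_2 = \alpha_3$ (or when $\beta = 1$), and they differ as soon as, for instance, $\alpha_1 \neq \alpha_2$ with $\beta < 1$, $\gamma > 0$ and $t(x) > 0$.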


5. Particular Case when Reliability and Importance Discounting of
Masses Commute.

Let's consider a classical mass $m : G^\Theta \to [0,1]$ (24)
and the focal set $F \subset G^\Theta$, $F = \{A_1, A_2, \ldots, A_m\}$, $m \ge 1$, (25)
and of course $m(A_i) > 0$, for $1 \le i \le m$.
Suppose $m(A_i) = a_i \in (0,1]$. (26)
a. Reliability first, Importance second.
Let $\alpha \in [0,1]$ be the reliability coefficient of $m(\cdot)$.
For $x \in G^\Theta \setminus \{\emptyset, I_t\}$, one has $m(x)_\alpha = \alpha \cdot m(x)$, (27)
and $m(I_t)_\alpha = \alpha \cdot m(I_t) + 1 - \alpha$. (28)
Let $\beta \in [0,1]$ be the importance coefficient of $m(\cdot)$.
Then, for $x \in G^\Theta \setminus \{\emptyset, I_t\}$,
$m(x)_{\alpha\beta} = (\beta\alpha m(x),\; \alpha m(x) - \beta\alpha m(x)) = \alpha \cdot m(x) \cdot (\beta, 1 - \beta)$, (29)
considering only two components: belief that x occurs and, respectively,
belief that x does not occur.

Further on,
$m(I_t)_{\alpha\beta} = (\beta\alpha m(I_t) + \beta - \beta\alpha,\; \alpha m(I_t) + 1 - \alpha - \beta\alpha m(I_t) - \beta + \beta\alpha) = [\alpha m(I_t) + 1 - \alpha] \cdot (\beta, 1 - \beta)$. (30)
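b. Importance first, Reliability second.
For $x \in G^\Theta \setminus \{\emptyset, I_t\}$, one has
$m(x)_\beta = (\beta \cdot m(x),\; m(x) - \beta \cdot m(x)) = m(x) \cdot (\beta, 1 - \beta)$, (31)
and $m(I_t)_\beta = (\beta m(I_t),\; m(I_t) - \beta m(I_t)) = m(I_t) \cdot (\beta, 1 - \beta)$. (32)
Then, for the reliability discounting scalar $\alpha$ one has:
$m(x)_{\beta\alpha} = \alpha m(x) (\beta, 1 - \beta) = (\alpha m(x) \beta,\; \alpha m(x) - \alpha\beta m(x))$ (33)

and $m(I_t)_{\beta\alpha} = \alpha \cdot m(I_t) (\beta, 1 - \beta) + (1 - \alpha)(\beta, 1 - \beta) = [\alpha m(I_t) + 1 - \alpha] \cdot (\beta, 1 - \beta)$
$= (\alpha m(I_t) \beta,\; \alpha m(I_t) - \alpha m(I_t) \beta) + (\beta - \alpha\beta,\; 1 - \alpha - \beta + \alpha\beta)$
$= (\alpha\beta m(I_t) + \beta - \alpha\beta,\; \alpha m(I_t) - \alpha\beta m(I_t) + 1 - \alpha - \beta + \alpha\beta)$. (34)

Hence (a) and (b) are equal in this case.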


6. Examples.

1. Classical mass.

The following classical mass is given on $\Theta = \{A, B\}$:

         A        B        A∪B
m        0.4      0.5      0.1
(35)

Let $\alpha = 0.8$ be the reliability coefficient and $\beta = 0.7$ be the importance
coefficient.

a. Reliability first, Importance second.

         A                 B                 A∪B
m_α      0.32              0.40              0.28
m_αβ     (0.224, 0.096)    (0.280, 0.120)    (0.196, 0.084)
(36)
We have computed in the following way:
$m_\alpha(A) = 0.8\, m(A) = 0.8(0.4) = 0.32$, (37)
$m_\alpha(B) = 0.8\, m(B) = 0.8(0.5) = 0.40$, (38)
$m_\alpha(A \cup B) = 0.8\, m(A \cup B) + 1 - 0.8 = 0.8(0.1) + 0.2 = 0.28$, (39)
and $m_{\alpha\beta}(A) = (0.7\, m_\alpha(A),\; m_\alpha(A) - 0.7\, m_\alpha(A)) = (0.7(0.32),\; 0.32 - 0.7(0.32)) = (0.224, 0.096)$, (40)
$m_{\alpha\beta}(B) = (0.7\, m_\alpha(B),\; m_\alpha(B) - 0.7\, m_\alpha(B)) = (0.7(0.40),\; 0.40 - 0.7(0.40)) = (0.280, 0.120)$, (41)
$m_{\alpha\beta}(A \cup B) = (0.7\, m_\alpha(A \cup B),\; m_\alpha(A \cup B) - 0.7\, m_\alpha(A \cup B)) = (0.7(0.28),\; 0.28 - 0.7(0.28)) = (0.196, 0.084)$. (42)


b. Importance first, Reliability second.

         A                 B                 A∪B
m        0.4               0.5               0.1
m_β      (0.28, 0.12)      (0.35, 0.15)      (0.07, 0.03)
m_βα     (0.224, 0.096)    (0.280, 0.120)    (0.196, 0.084)
(43)

We computed in the following way:
$m_\beta(A) = (\beta m(A),\; (1 - \beta) m(A)) = (0.7(0.4),\; (1 - 0.7)(0.4)) = (0.280, 0.120)$, (44)
$m_\beta(B) = (\beta m(B),\; (1 - \beta) m(B)) = (0.7(0.5),\; (1 - 0.7)(0.5)) = (0.35, 0.15)$, (45)
$m_\beta(A \cup B) = (\beta m(A \cup B),\; (1 - \beta) m(A \cup B)) = (0.7(0.1),\; (1 - 0.7)(0.1)) = (0.07, 0.03)$, (46)

and $m_{\beta\alpha}(A) = \alpha\, m_\beta(A) = 0.8(0.28, 0.12) = (0.8(0.28),\; 0.8(0.12)) = (0.224, 0.096)$, (47)
$m_{\beta\alpha}(B) = \alpha\, m_\beta(B) = 0.8(0.35, 0.15) = (0.8(0.35),\; 0.8(0.15)) = (0.280, 0.120)$, (48)
$m_{\beta\alpha}(A \cup B) = \alpha\, m(A \cup B)(\beta, 1 - \beta) + (1 - \alpha)(\beta, 1 - \beta) = 0.8(0.1)(0.7, 1 - 0.7) + (1 - 0.8)(0.7, 1 - 0.7) = 0.08(0.7, 0.3) + 0.2(0.7, 0.3) = (0.056, 0.024) + (0.140, 0.060) = (0.056 + 0.140,\; 0.024 + 0.060) = (0.196, 0.084)$. (49)


Therefore reliability discount commutes with importance discount of 


sources when one has classical masses. 


The result is interpreted this way: belief in A is 0.224 and belief in
nonA is 0.096, belief in B is 0.280 and belief in nonB is 0.120, and belief
in total ignorance A∪B is 0.196, and belief in non-ignorance is 0.084.
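The arithmetic of this first example can be rechecked with a short script. This is a sketch under stated assumptions, not the author's code: masses are stored as tuples (a 1-tuple for a classical mass, a pair after importance discounting), "A|B" is an illustrative label for A ∪ B, and the function names are hypothetical.

```python
from math import isclose

IGNORANCE = "A|B"  # illustrative label for the total ignorance A ∪ B


def reliability(m, alpha, ignorance=IGNORANCE):
    """Scalar reliability discount on every component; the remainder goes to ignorance."""
    moved = [0.0] * len(m[ignorance])
    out = {}
    for x, comps in m.items():
        if x == ignorance:
            continue
        out[x] = tuple(alpha * c for c in comps)
        for k, c in enumerate(comps):
            moved[k] += (1.0 - alpha) * c
    out[ignorance] = tuple(c + d for c, d in zip(m[ignorance], moved))
    return out


def importance(m, beta):
    """Split each classical mass v into the pair (beta*v, (1-beta)*v), as in (31)."""
    return {x: (beta * v, (1.0 - beta) * v) for x, (v,) in m.items()}


def same(m1, m2, tol=1e-12):
    """Component-wise comparison of two discounted masses."""
    return all(isclose(a, b, abs_tol=tol) for x in m1 for a, b in zip(m1[x], m2[x]))


m = {"A": (0.4,), "B": (0.5,), IGNORANCE: (0.1,)}
ab = importance(reliability(m, 0.8), 0.7)   # reliability first
ba = reliability(importance(m, 0.7), 0.8)   # importance first
print(ab)
print(ba)
print(same(ab, ba))
```

Both orders give the pairs listed in tables (36) and (43), up to floating-point rounding.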


7. Same Example with Different Redistribution of Masses Related to
Importance of Sources.

Let’s consider the third way of redistribution of masses related to the
importance coefficient of sources: $\beta = 0.7$, but $\gamma = 0.4$, which means that
40% of the missing mass $(1 - \beta) \cdot t(x)$ is redistributed to i(x) and 60% of it
is redistributed to f(x) for each $x \in G^\Theta \setminus \{\emptyset\}$; and $\alpha = 0.8$.

a. Reliability first, Importance second.

         A                          B                          A∪B
m        0.4                        0.5                        0.1
m_α      0.32                       0.40                       0.28
m_αβ     (0.2240, 0.0384, 0.0576)   (0.2800, 0.0480, 0.0720)   (0.1960, 0.0336, 0.0504)
(50)




We computed $m_\alpha$ in the same way.

But:

$m_{\alpha\beta}(A) = (\beta \cdot m_\alpha(A),\; i_\alpha(A) + (1 - \beta) m_\alpha(A) \gamma,\; f_\alpha(A) + (1 - \beta) m_\alpha(A)(1 - \gamma)) = (0.7(0.32),\; 0 + (1 - 0.7)(0.32)(0.4),\; 0 + (1 - 0.7)(0.32)(1 - 0.4)) = (0.2240, 0.0384, 0.0576)$. (51)

Similarly for $m_{\alpha\beta}(B)$ and $m_{\alpha\beta}(A \cup B)$.


b. Importance first, Reliability second.

         A                          B                          A∪B
m        0.4                        0.5                        0.1
m_β      (0.280, 0.048, 0.072)      (0.350, 0.060, 0.090)      (0.070, 0.012, 0.018)
m_βα     (0.2240, 0.0384, 0.0576)   (0.2800, 0.0480, 0.0720)   (0.1960, 0.0336, 0.0504)
(52)

We computed $m_\beta(\cdot)$ in the following way:

$m_\beta(A) = (\beta \cdot t(A),\; i(A) + (1 - \beta) t(A) \gamma,\; f(A) + (1 - \beta) t(A)(1 - \gamma)) = (0.7(0.4),\; 0 + (1 - 0.7)(0.4)(0.4),\; 0 + (1 - 0.7)(0.4)(1 - 0.4)) = (0.280, 0.048, 0.072)$. (53)

Similarly for $m_\beta(B)$ and $m_\beta(A \cup B)$.
To compute $m_{\beta\alpha}(\cdot)$, we take $\alpha_1 = \alpha_2 = \alpha_3 = 0.8$, (54)
in formulas (8) and (9).
$m_{\beta\alpha}(A) = \alpha \cdot m_\beta(A) = 0.8(0.280, 0.048, 0.072)$
$= (0.8(0.280),\; 0.8(0.048),\; 0.8(0.072))$
$= (0.2240, 0.0384, 0.0576)$. (55)

Similarly $m_{\beta\alpha}(B) = 0.8(0.350, 0.060, 0.090) = (0.2800, 0.0480, 0.0720)$. (56)

For $m_{\beta\alpha}(A \cup B)$ we use formula (9):

$m_{\beta\alpha}(A \cup B) = (t_\beta(A \cup B) + (1 - \alpha)[t_\beta(A) + t_\beta(B)],\; i_\beta(A \cup B) + (1 - \alpha)[i_\beta(A) + i_\beta(B)],\; f_\beta(A \cup B) + (1 - \alpha)[f_\beta(A) + f_\beta(B)])$
$= (0.070 + (1 - 0.8)[0.280 + 0.350],\; 0.012 + (1 - 0.8)[0.048 + 0.060],\; 0.018 + (1 - 0.8)[0.072 + 0.090])$
$= (0.1960, 0.0336, 0.0504)$.


Again, the reliability discount and importance discount commute. 
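The second example, and the contrast with Section 4, can be sketched numerically. The code below applies formulas (8), (9) and (15) in both orders to the classical mass of the example, first with equal reliability components (0.8, 0.8, 0.8) and then with unequal ones. The unequal triple, the label "A|B" and the function names are hypothetical choices made for illustration.

```python
from math import isclose

IGNORANCE = "A|B"  # illustrative label for the total ignorance A ∪ B


def reliability(mass, alpha, ignorance=IGNORANCE):
    """Formulas (8)-(9) with alpha = (alpha1, alpha2, alpha3)."""
    moved = [0.0, 0.0, 0.0]
    out = {}
    for x, triple in mass.items():
        if x == ignorance:
            continue
        out[x] = tuple(a * c for a, c in zip(alpha, triple))
        for k in range(3):
            moved[k] += (1.0 - alpha[k]) * triple[k]
    out[ignorance] = tuple(c + d for c, d in zip(mass[ignorance], moved))
    return out


def importance(mass, beta, gamma):
    """Formula (15), the third way of importance discounting."""
    out = {}
    for x, (t, i, f) in mass.items():
        missing = (1.0 - beta) * t
        out[x] = (beta * t, i + gamma * missing, f + (1.0 - gamma) * missing)
    return out


def same(m1, m2, tol=1e-12):
    return all(isclose(a, b, abs_tol=tol) for x in m1 for a, b in zip(m1[x], m2[x]))


m = {"A": (0.4, 0.0, 0.0), "B": (0.5, 0.0, 0.0), IGNORANCE: (0.1, 0.0, 0.0)}
beta, gamma = 0.7, 0.4

equal_alpha = (0.8, 0.8, 0.8)
commutes_equal = same(
    importance(reliability(m, equal_alpha), beta, gamma),
    reliability(importance(m, beta, gamma), equal_alpha),
)

unequal_alpha = (0.8, 0.5, 0.9)  # hypothetical reliability triple
commutes_unequal = same(
    importance(reliability(m, unequal_alpha), beta, gamma),
    reliability(importance(m, beta, gamma), unequal_alpha),
)
print(commutes_equal, commutes_unequal)
```

With equal components the two orders coincide, as in tables (50) and (52); with the unequal triple they differ on i(A) and f(A), which is the situation described in Section 4.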


8. Conclusion. 


In this paper we have defined a new way of discounting a classical and 
neutrosophic mass with respect to its importance. We have also defined the 


discounting of a neutrosophic source with respect to its reliability. 


In general, the reliability discount and importance discount do not 


commute. But if one uses classical masses, they commute (as in Examples 1 
and 2). 
DSm Theory for Fusing Highly Conflicting ESM Reports


Pierre Valin 
Pascal Djiknavorian 
Dominic Grenier 


Abstract - Electronic Support Measures consist of passive 
receivers which can identify emitters coming from a small 
bearing angle, which, in turn, can be related to platforms 
that belong to 3 classes: either Friend, Neutral, or Hostile. 
Decision makers prefer results presented in STANAG 1241 
allegiance form, which adds 2 new classes: Assumed 
Friend, and Suspect. Dezert-Smarandache (DSm) theory is 
particularly suited to this problem, since it allows for 
intersections between the original 3 classes. Results are 
presented showing that the theory can be successfully 
applied to the problem of associating ESM reports to 
established tracks, and its results identify when miss- 
associations have occurred and to what extent. Results are 
also compared to Dempster-Shafer theory which can only 
reason on the original 3 classes. Thus decision makers are 
offered STANAG 1241 allegiance results in a timely 
manner, with quick allegiance change when appropriate 
and stability in allegiance declaration otherwise. 
Keywords: Electronic Support Measures, Dezert- 
Smarandache, Dempster-Shafer, allegiance, fusion. 


1 Introduction 


Electronic Support Measures (ESM) consist of passive 
receivers which can identify emitters coming from a small 
bearing angle, but cannot determine range (although some 
are in development to provide a rough measure of range). 
The detected emitters can be related to platforms that 
belong to 3 classes: either Friend (F=1), Neutral (N=2) or
Hostile (H=3), hereafter called ESM-allegiance, within
that bearing angle.

In the case of dense targets, ESM-allegiance can
fluctuate wildly due to miss-associations of an ESM report
to an established track. Hence, decision makers would like the
target platforms to be identified on a more refined basis,
belonging to 5 classes: Hostile (or Foe), Suspect (S),
Neutral, Assumed Friend (AF), and Friend, since they
realize that no fusion algorithm can be perfect and would
prefer some stability in an allegiance declaration, rather
than oscillations between extremes. This will hereafter be
referred to as STANAG 1241 allegiance, or just STANAG
allegiance for short [1].

With this more refined STANAG-allegiance, a
decision maker would probably take no aggressive action
against either a friend or an assumed friend (although
he/she would monitor an assumed friend more closely).
Similarly a decision maker would probably take aggressive
action against a foe and send a reconnaissance force (or a
warning salvo) towards a suspect. Neutral platforms would
correspond to countries not involved in the current conflict.

Originally published as Valin P., Djiknavorian P., Grenier D., DSm theory
for fusing highly conflicting ESM reports, in Proc. of Fusion 2009, Seattle,
WA, USA, 6-9 July 2009, and reprinted with permission.

All incoming sensor declarations correspond to a 
frame of discernment of 3 classes, and several theories exist 
to treat a series of such declarations to obtain a fused result 
in the same frame of discernment, like Bayesian reasoning 
and Dempster-Shafer (DS) reasoning [2, 3] (often called 
evidence theory). However, when the output frame of 
discernment is larger than the input frame of discernment,
an interpretation has to be made as to what this could mean, 
or how that could be generated. This is the subject of the 
next section. 


1.1 Some solutions


It should be noted that Bayes theory is implemented in a
very complex form in STANAG 4162 [4], and that DS
theory is found on board many platforms, such as the
German F124 frigates [5], the Finnish Fast Attack Craft
Squadron 2000 [6], and the Light Airborne Multi-Purpose
System (LAMPS) helicopters of the US Navy [7]. The
translation from DS to Bayes can be performed via the
pignistic transformation [8], and the result broadcast via
tactical data links.

In all these implementations, the emitter detected is
first correlated to a platform, and then to an allegiance.
According to [9], recognition of a platform can range from
a very rough scale (e.g. combatant/merchant) to a very fine
one (e.g. name of contact/track), whereas identification
refers to the assignment of one of the 6 standard STANAG
1241 identities (for which we adopt the word “allegiance”
in this report) to a track. The extra identity is “unknown”,
which we disregard in this report, assuming that all detected
emitters are identifiable.

Therefore, this report investigates an alternative
method of achieving STANAG-allegiances, which does not
aim to compete with the above implementations, but rather
can be seen as an expert advisor to the decision maker.
Since Dezert-Smarandache theory was only developed
extensively after the publications of the STANAGs, this
could not have been foreseen by NATO, and is thus worthy
of experimentation.


1.2 An interpretation of STANAG 1241


Dezert-Smarandache (DSm) theory can coherently, with 
well-defined fusion rules, lead to an output amongst 5 
classes, even though the input classes number only 3, 
because the theory allows for intersections. For example, 


• “Suspect” might be the result obtained after fusing
“Hostile” with “Neutral”, and

• “Assumed Friend” might be the result obtained after
fusing “Friend” with “Neutral”.


This is illustrated in the Venn diagram of Figure 1 below.





Figure 1. Venn diagram for the STANAG allegiances. 


Note that the set intersection $\theta_1 \cap \theta_3 = \emptyset$, the null set, which is
a constraint in DSm, leading to the use of its hybrid rule. It 
also corresponds to the most likely mission for Canadian 
Forces (CF), namely peace-keeping, or general 
surveillance, when hostile and friendly forces are not likely 
to be located close to each other. 


1.3 Another interpretation of STANAG 1241 


The interpretation in the preceding sub-section is a 
conservative one, namely that there is only one easy way to 
become suspect. This could correspond to a decision maker 
being in a non-threatening situation due to the choice of 
mission, e.g. peace-keeping. There could be situations 
where there is a need for a more aggressive response. In the 
case of a combat mission for example, the appropriate Venn 
diagram might be the one of Figure 2, where there are many 
more ways to become suspect, namely all the intersections 
bordering Hostile. 


Figure 2. Another possible Venn diagram. 


Figure 2 corresponds to a combat situation more 
appropriate for the USA, or to the CF as long as they play 
an active role in the Kandahar region of Afghanistan. The 
situation of Figure 1 will be the one implemented in this 
paper, as it is more in line with CF roles, and also because 
all of the features of DSm theory can be exercised, without 
the additional complexity of keeping all the intersections of
Figure 2.


2 Dezert-Smarandache Theory 
2.1 Formulae for DS and DSm theories


Since DS theory has been in use for over 40 years, the
reader is assumed to be familiar with it [2, 3]. DSm theory
encompasses DS theory as a special case, namely when all
intersections are null. Both use the language of masses
assigned to each declaration from a sensor (in our case, the
ESM sensor). A declaration is a set made up of singletons
of the frame of discernment $\Theta$, and all sets that can be made
from them through unions are allowed (this is referred to as
the power set $2^\Theta$ of DS theory). In DSm theory, all unions
and intersections are allowed for a declaration, this forming
the much larger hyper power set $D^\Theta$. For our special case
of cardinality 3, $\Theta = \{\theta_1, \theta_2, \theta_3\}$, with $|\Theta| = 3$, $D^\Theta$ is still of
manageable size, namely has a cardinality of 19.

In DST, a combined “fused” mass is obtained by
combining the previous (presumably the results of previous
fusion steps) $m_1(A)$ with a new $m_2(B)$ to obtain a new fused
result by applying the conjunctive rule


$(m_1 \oplus m_2)(C) = \sum_{A \cap B = C} m_1(A)\, m_2(B)$ (1)


where $C = A \cap B$, and by re-normalizing (dividing by $1 - K$), where K
is the conflict corresponding to the sum of all masses for
which the set intersection yields the null set. This common
renormalization is a critical feature of DS theory, and
allows for it to be associative, whereas a multitude of
alternate ways of redistributing the conflict (proposed by
numerous authors) lose this property. The associativity of
DST is key when the time tags of the sensor reports are
unreliable, since associative rules are impervious to a
different order of reports coming in, but all other rules can
be extremely sensitive to the order of reports. This is the
main reason we concentrate only on DS vs. DSm, but
another reason is the proliferation of alternatives to DS,
which redistribute the conflict in various fashions (for a
review, see [10]).
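The conjunctive rule (1) followed by renormalization can be sketched as below. This is a minimal illustration rather than the authors' code: declarations are assumed to be `frozenset` objects over the three input classes, and the 0.7/0.3 masses are illustrative values.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Conjunctive rule (1), then division by (1 - K), where K is the conflict."""
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        c = a & b  # set intersection C = A ∩ B
        if c:
            fused[c] = fused.get(c, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the null set
    if conflict >= 1.0:
        raise ValueError("total conflict: the two masses cannot be combined")
    return {c: w / (1.0 - conflict) for c, w in fused.items()}, conflict


F, N, H = frozenset({"F"}), frozenset({"N"}), frozenset({"H"})
THETA = F | N | H  # complete ignorance

report_friend = {F: 0.7, THETA: 0.3}
report_hostile = {H: 0.7, THETA: 0.3}
fused, K = dempster_combine(report_friend, report_hostile)
print(K, fused)
```

Two directly contradicting reports put 0.49 of the product mass on the null set; after renormalization Friend and Hostile share the remainder equally and the ignorance keeps 0.09/0.51.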

In DSm theory, a constraint like the one that was
imposed by Figure 1, namely that $\theta_1 \cap \theta_3 = \emptyset$, is treated by the
hybrid DSm rule below:

$m(A) = \phi(A)\, [\, S_1(A) + S_2(A) + S_3(A)\, ]$ (2)
The reader is referred to a series of books [10, 11] on DSm
theory for lengthy descriptions of the meaning of this
formula (note that the function $\phi$ is not to be confused with
the empty set). A three-step approach is proposed in chapter
5 of [11], which is used here.

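If the incoming sensor reports are in DS-space: Friend
(F=1), Neutral (N=2) or Hostile (H=3), then Figure 1 has
the interpretation in DSm space (allowing intersections
during the fusion step) of:


Friend = $\{\theta_1 - \theta_1 \cap \theta_2\}$
Hostile = $\{\theta_3 - \theta_3 \cap \theta_2\}$
Assumed Friend = $\{\theta_1 \cap \theta_2\}$
Suspect = $\{\theta_2 \cap \theta_3\}$
Neutral = $\{\theta_2 - \theta_1 \cap \theta_2 - \theta_3 \cap \theta_2\}$

As expected, all STANAG-allegiances (masses assigned to
the sets mentioned above) sum up to 1, as shown below.
The left hand side, which is the sum of the masses for all 5
classes, yields the right hand side, which is unity in DSm
theory.


$(\theta_1 - \theta_1 \cap \theta_2) + (\theta_3 - \theta_3 \cap \theta_2) + \theta_1 \cap \theta_2 + \theta_2 \cap \theta_3 + (\theta_2 - \theta_1 \cap \theta_2 - \theta_3 \cap \theta_2) = \theta_1 + \theta_2 + \theta_3 - \theta_1 \cap \theta_2 - \theta_3 \cap \theta_2 = 1$ (3)

(since $m(\theta_1 \cap \theta_3) = 0$, i.e. $\theta_1 \cap \theta_3 = \emptyset$ by Figure 1).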
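One way to check that the five allegiances partition the frame is to model each input class as a set of disjoint Venn regions. The sketch below is an assumption-laden illustration: the five region labels stand for the areas of Figure 1, and the region with Friend and Hostile overlapping is left out because it is constrained to be empty.

```python
# Assumed labels for the five disjoint Venn regions of Figure 1.
FRIEND, ASSUMED_FRIEND, NEUTRAL, SUSPECT, HOSTILE = "F", "AF", "N", "S", "H"

theta1 = {FRIEND, ASSUMED_FRIEND}             # input class Friend
theta2 = {ASSUMED_FRIEND, NEUTRAL, SUSPECT}   # input class Neutral
theta3 = {SUSPECT, HOSTILE}                   # input class Hostile

# The five STANAG allegiances, written exactly as in the list above.
allegiance = {
    "Friend": theta1 - (theta1 & theta2),
    "Hostile": theta3 - (theta3 & theta2),
    "Assumed Friend": theta1 & theta2,
    "Suspect": theta2 & theta3,
    "Neutral": theta2 - (theta1 & theta2) - (theta3 & theta2),
}

covered = set().union(*allegiance.values())
sizes = sum(len(region) for region in allegiance.values())
print(allegiance)
print(covered == theta1 | theta2 | theta3, sizes == len(covered))
```

The regions are pairwise disjoint and cover the union of the three input classes, which is the set-level counterpart of the statement that the five masses sum to 1.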


2.2 A typical simulation scenario 


In order to compare DS with DSm, one must list the
prerequisites that the scenario must address. It must:

• be able to adequately represent the known ground
truth

• contain sufficient miss-associations to be realistic and
to test the robustness of the theories

• only provide partial knowledge about the ESM sensor
declaration, which therefore contains uncertainty

• be able to show stability under countermeasures, yet

• be able to switch allegiance when the ground truth
does so


The following scenario parameters have therefore
been chosen accordingly:

• Ground truth is FRIEND for the first 50 iterations of
the scenario and HOSTILE for the last 50.

• The number of correct associations is 80%,
corresponding to countermeasures appearing 20% of
the time, in a randomly selected sequence.

• The ESM declaration has a mass (confidence value in
Bayesian terms) of 0.7, with the rest (0.3) being
assigned to the ignorance (the full set of elements,
namely $\Theta$).

The last 2 bullets of the first list would translate into
stability for the first 50 iterations and eventual stability for
the last 50 iterations, after the allegiance switch at iteration
50.

This scenario will be the one addressed in the next 
section, while a Monte-Carlo study is described in the 
subsequent section. Each Monte-Carlo run corresponds to 
a different realization using the above scenario parameters, 
but with a different random seed. The scenario chosen is 
depicted in Figure 3 below. 




Figure 3. Chosen scenario. 


The vertical axis represents the allegiance Friend, Neutral, 
or Hostile. Roughly 80% of the time the ESM declares the 
correct allegiance according to ground truth, and the 
remaining 20% is roughly equally split between the other 
two allegiances. There is an allegiance switch at the 50th
iteration, and the randomly selected seed of the scenario
generated above produces a rather unusual
sequence of 4 false Friend declarations starting at iteration
76 (when Hostile is actually the ground truth), which will
be very challenging for the theories.
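A scenario with these parameters can be generated along the following lines. This is a sketch under stated assumptions: iterations are counted from 0, wrong declarations are split evenly between the two other classes, and the function names and the seed are hypothetical. It will not reproduce the exact sequence of Figure 3, whose random seed is not given.

```python
import random

CLASSES = ("Friend", "Neutral", "Hostile")


def ground_truth(k):
    """Friend for the first 50 iterations (0..49), Hostile for the last 50."""
    return "Friend" if k < 50 else "Hostile"


def make_scenario(seed, n_iter=100, p_correct=0.8):
    """Return one ESM declaration per iteration, correct with probability p_correct."""
    rng = random.Random(seed)  # a different seed gives a different realization
    declarations = []
    for k in range(n_iter):
        truth = ground_truth(k)
        if rng.random() < p_correct:
            declarations.append(truth)
        else:
            others = [c for c in CLASSES if c != truth]
            declarations.append(rng.choice(others))  # miss-association
    return declarations


def to_mass(declaration, confidence=0.7):
    """Mass 0.7 on the declared class, the remaining 0.3 on the ignorance."""
    return {frozenset({declaration}): confidence, frozenset(CLASSES): 1.0 - confidence}


scenario = make_scenario(seed=1)
masses = [to_mass(d) for d in scenario]
print(scenario[:10])
```

Using a dedicated `random.Random(seed)` instance keeps each realization reproducible, which is convenient when the same scenario has to be replayed through several combination rules.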


3 Results for the simulated scenario 


Before presenting the results for DS, it should be noted that 
the original form of DS tends to be overly optimistic. 
Given enough evidence concerning an allegiance, it will be 
very hard for it to change allegiances at iteration 50. This is 
a well-known problem, and a well-known ad hoc solution 
exists [12], and consists in renormalizing after each fusion
step by giving a value to the complete ignorance which can
never be below a certain factor (chosen here to be 0.02).
Comparison will be made with DSm and the Proportional
Conflict Redistribution rule #5 (PCR5) preferred by
Dezert and Smarandache [10].
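One plausible reading of this ad hoc fix is sketched below. The exact procedure of [12] is not reproduced here, so the rescaling is an assumption: when the mass on the complete ignorance falls below the floor, it is raised to the floor and the other masses are scaled down proportionally so that the total stays 1.

```python
def enforce_ignorance_floor(mass, theta, floor=0.02):
    """Keep m(theta) >= floor by proportionally rescaling the other masses (assumed scheme)."""
    current = mass.get(theta, 0.0)
    if current >= floor:
        return dict(mass)
    scale = (1.0 - floor) / (1.0 - current)  # shrink everything except theta
    out = {x: w * scale for x, w in mass.items() if x != theta}
    out[theta] = floor
    return out


F = frozenset({"F"})
THETA = frozenset({"F", "N", "H"})

overconfident = {F: 0.995, THETA: 0.005}  # illustrative fused result
floored = enforce_ignorance_floor(overconfident, THETA)
print(floored)
```

With a floor of 0.02, no single allegiance can hold more than 0.98 of the mass, which leaves room for a later allegiance change.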


3.1 DS results 


The result for DST is shown in Figure 4 below with Friend 
(1), Neutral (2) and Hostile (3). 




Figure 4. DS result for the chosen scenario. 


DS never becomes confused, reaches the ESM-allegiance 
quickly and maintains it until iteration 50. It then reacts 
reasonably rapidly and takes about 6 reports before 
switching allegiance as it should. Furthermore after being 
confused for an iteration around the sequence of 4 Friend 
reports starting at iteration 76, it quickly reverts to the 
correct Hostile status. 

Note that a decision maker could look at this curve 
and see an oscillation pointing to miss-associations without 
being able to clearly distinguish between a miss-association 
with the other two possible allegiances. This fairly quick 
reaction is due to the 0.02 assigned to the ignorance, which 
translates to DS never being more than 98% sure of an 
ESM-allegiance, as can be seen by the curve topping out at 
0.98. Figure 4 shows the mass, which is also the pignistic 
probability for this case, with the latter being normally used 
to make a decision. 


3.2 DSm results 


For the hybrid DSm rule [10], it was suggested to use the 
Generalized Pignistic Probability in order to make a 
decision on a singleton belonging to the input ESM- 
allegiance. This seems to cause problems [13]. Since the 
whole idea behind using DSm was to present the results to 
the decision maker in the STANAG allegiance format, the 
result of Figure 5 would be shown to the decision maker. 
Figure 5. DSm result for the chosen scenario.


The decision maker would clearly be informed that miss- 
associations have occurred, since Assumed Friend 
dominates for the first 50 iterations and Suspect for the 
latter 50. DSm is more susceptible to miss-associations 
than DS (the dips are more pronounced), but it has the 
advantage of giving extra information to the decision 
maker, namely that the fusion algorithm is having difficulty 
with associating ESM reports to established tracks. 

Just as for DS, the Friend declarations starting at
iteration 76 cause confusion, as they should. The change in
allegiance at iteration 50 is detected nearly as fast as with DST.
What is even more important is that F and AF are clearly
preferred for the first 50 iterations and S and H for the last
50, as they should be.


3.3 PCR5 results


PCR5 shows a similar behaviour, but is much less sure of
what’s going on (the peaks are not as pronounced), as seen
in Figure 6. Again, F and AF are clearly preferred for the
first 50 iterations and S and H for the last 50, as they
should be.
Figure 6. PCR5 result for the chosen scenario.


3.4 Decision-making threshold 


Because of the occasionally oscillatory nature of some 
combination rules, one has to ask oneself when to make a 
decision or recommend one to the commander. This is 
illustrated in Figure 7 for DS although the same is 
applicable for all the others. A threshold at a very secure
90% would result in a longer time for allegiance change, 
and result in a longer period of indecision around iteration 
76, compared to one at 70%. 
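The role of the threshold can be illustrated with a small decision function. It is a sketch only: the input is assumed to be a dictionary of per-allegiance support values (masses or pignistic probabilities), and returning `None` to signal "no decision yet" is a hypothetical convention.

```python
def decide(support, threshold=0.7):
    """Return the best-supported allegiance if it reaches the threshold, else None."""
    best, value = max(support.items(), key=lambda kv: kv[1])
    return best if value >= threshold else None


# Illustrative support values at one iteration.
support = {"Friend": 0.75, "Neutral": 0.05, "Hostile": 0.20}
print(decide(support, threshold=0.7))  # decision is made
print(decide(support, threshold=0.9))  # stays undecided
```

A higher threshold leaves more iterations undecided, which matches the longer period of indecision described for the 90% setting.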





Figure 7. Decision thresholds. 


4 Monte-Carlo results 


Although a special case such as the one described in the 
previous section offers valuable insight, one might question 
if the conclusions from that one scenario pass the test of 
multiple Monte-Carlo scenarios. This question is answered 
in this section. 

In order to sample the parameter space in a different 
way, the simulations below correspond to 90% correct 
associations (higher than the previous 80%), an ESM 
confidence at 60% (lower than the previous 70%) and an 
ignorance threshold at 0.02 as before. The number of 
Monte-Carlo runs was set to 100. 


4.1 DS results


The result for DS is shown in Figure 8. As expected, since 
DS reasons over the 3 input classes, Suspect and Assumed 
Friend are not involved. 
Figure 8. DS result after 100 Monte-Carlo runs.


Naturally, since Assumed Friend and Suspect do not exist 
in DST, these are calculated as zero. Friend, Neutral, and 
Hostile have the expected behaviour. One sees the same 


response times, after an average over 100 runs, as was seen 
in the selected scenario of the previous section. 


4.2 DSm results 


The similar result for DSm is shown in Figure 9 below. In 
this case, AF dominates for the first 50 iterations, on 
average (over 100 runs) and S for the last 50, confirming 
that the chosen scenario was representative of the behaviour 
of DSm. The response times are similar on average also. 
DSm is slightly less sure (plateau at 70%) than DS (plateau 
at 80%), but this can be adjusted by lowering the decision 
threshold accordingly. 
Figure 9. DSm result after 100 Monte-Carlo runs.


4.3 PCR5 results 


Finally, the PCR5 result is shown in Figure 10 below. In
this case also, AF dominates for the first 50 iterations, on
average (over 100 runs), and S for the last 50, confirming
that the chosen scenario was representative of the behaviour
of PCR5. The response times are similar on average also.
PCR5 is slightly less sure (plateau at 60%) than DST
(plateau at 80%) or DSmT (plateau at 70%).
Figure 10. PCR5 result after 100 Monte-Carlo runs.


4.4 Effect of varying the ESM parameters 


In order to study the effects of varying the ESM parameters, 
the simulations below correspond to an ESM confidence at 
80% (higher than the previous 60%) and an ignorance 
threshold at 0.05 (higher than the 0.02 used previously).
The number of Monte-Carlo runs was again set to 100. 

A filter was also applied to the input ESM 
declarations over a window of 4 iterations. The filter 
assigns lesser confidence to ESM reports which are not well 
represented in the window. More on this sliding window 
filtering is available in [13]. The idea of such a sliding 
window has also been studied before with good results for a 
variety of reasoning schemes [14]. The results are shown in 
Figure 11 for DS, Figure 12 for DSm and Figure 13 for 
PCR5. From these figures, one can see the smoothing
effect of the filter, but more importantly that all of the
conclusions of the previous Monte-Carlo runs, as well as
those of the selected scenario of the previous section, hold
in their totality.
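The text does not spell out the filter, which is detailed in [13]. Purely as a hypothetical illustration of the idea, the sketch below scales the base ESM confidence by the share of the current 4-iteration window that agrees with the current declaration; the scaling rule and all names are assumptions, not the scheme of [13].

```python
from collections import Counter, deque


def windowed_confidences(declarations, base_confidence=0.8, window=4):
    """Assumed rule: confidence = base * (share of the window agreeing with the report)."""
    recent = deque(maxlen=window)  # sliding window of the latest declarations
    confidences = []
    for d in declarations:
        recent.append(d)
        share = Counter(recent)[d] / len(recent)
        confidences.append(base_confidence * share)
    return confidences


# An isolated Hostile report among Friend reports gets a reduced confidence.
conf = windowed_confidences(["F", "F", "H", "F"])
print(conf)
```

Under this assumed rule an isolated report is not discarded, only weakened, so a genuine allegiance change still gets through once it fills the window.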
Figure 12: DSm result after 100 runs and input filter.
Figure 13: PCR5 result after 100 runs and input filter.


5 Conclusions 


ESM consists of passive receivers that can identify emitters
coming from a small bearing angle, which, in turn, can be
related to platforms that belong to 3 classes: either Friend,
Neutral, or Hostile. Decision makers, however, would prefer
results presented in STANAG 1241 allegiance form, which
adds 2 new classes: Assumed Friend and Suspect. For these
reasons Dezert-Smarandache theory was used, and also
compared to Dempster-Shafer theory. In Dezert-Smarandache
theory an intersection of Friend and Neutral can lead to an
Assumed Friend, and an intersection of Hostile and Neutral
can lead to a Suspect.

Results were presented showing that the theory can be 
successfully applied to the problem of associating ESM 
reports to established tracks, confirming the work published 
in [15]. Results are also compared to Dempster-Shafer 
theory which can only reason on the original 3 classes. 
Thus decision makers are offered STANAG 1241 
allegiance results in a timely manner, with quick allegiance 
change when appropriate, and stability in allegiance 
declaration otherwise. 

In more detail, results were presented for a typical
scenario and for Monte-Carlo runs with the same 
conclusions, namely that Dempster-Shafer works well over 
the original 3 classes, if a minimum to the ignorance is 
applied. The same can be said for Dezert-Smarandache 
theory, and to a lesser extent for a popular Proportional 
Conflict Redistribution rule, but with the added benefit that 
Dezert-Smarandache theory identifies when miss- 
associations occur, and to what extent. 

Finally, the effects of varying the input parameters for 
the performance of the ESM were studied, and all of the 
conclusions remain the same. 
An Application of DSmT in Ontology-Based Fusion Systems


Ksawery Krenc 


Adam Kawalec 


Abstract — The aim of this paper is to propose an 
ontology framework for preselected sensors due to the 
sensor networks’ needs, regarding a specific task, such as 
the target’s threat recognition. The problem will be 
solved methodologically, taking into account particularly 
non-deterministic nature of functions assigning the 
concept and the relation sets into the concept and relation 
lexicon sets respectively and vice-versa. This may 
effectively enhance the efficiency of the information 
fusion performed in sensor networks. 


Keywords: Attribute information fusion, DSmT, belief 
function, ontologies, sensor networks. 


1 Introduction 


Ontologies of the most applied sensors do not take into 
account needs of sensor networks [1]. Sensors, in 
particular the more complex ones, like radars or sonars are 
intended to be utilized autonomously. 

The foundation of sensor networks (SN),
understood as networks of cooperative monitoring,
is that information obtained from some elements
can be understood by the others. Thus the question of the
common language is very important. The ontology of a
sensor network should be unified and structured.

The key problem in this paper is neither a direct 
application of existing solutions in the field of ontologies 
for the sensor networks nor a design of a new ontology, 
ready to implement. The aim is to propose the ontology 
framework for networks, consisting of preselected 
sensors, due to the sensory needs, to perform a specific 
task, such as recognizing the target threat. 

The sensors are selected in four steps:


1. Describing which pieces of information are required
to define the target threat;


2. Describing which sensors make it possible to obtain
those pieces of information;


3. Identifying all the information that the preselected
sensors can acquire;


4. Selecting the specific sensors.
2 Sensor type selection


This section focuses on creating the ontology of a sensor
network that processes information related to the target
threat attribute. This information may be classified,
according to its origin, as:


• Observable: originating directly from sensors or
visual sightings;


• Deductable (abductable): obtained by deductive
reasoning from other observable attributes gathered
previously;


• Observable and deductable: obtained both on the
basis of observation and by deductive reasoning;


• Confirmed: verified by another information center
or an external sensor network.


The observable attributes may be defined based on
information originating from diverse sensors. For the
purpose of this paper the scope of usable sensors is
constrained to the set which, in the authors’ opinion,
fully reflects the required information about the target
in the real world.

It is a very important assumption that the selection of
sensor types is conditioned ontologically. This means
that no particular sensor model, communication protocol
or other element of the SN organization will be
discussed.

From the point of view of the observer (whose main duty
is to assess the target threat) it is important to define the
following features of the target:


• Key attribute of the target: the threat (based on
observations);


• Additional target attributes (as the basis for
deductive reasoning about the threat), i.e. the
platform (frigate, corvette, destroyer) and the
activity (attack, reconnaissance, search &
rescue);


• Auxiliary characteristics: target position.
2.1 Types of sensors


The preselected target features may be registered by
various means of observation, namely:


• Position: radar (all spatial dimensions), sonar,
IR sensor (mostly to define target azimuth and
elevation);


• Threat: IFF, visual sightings (human), video
camera (daylight or night vision);


• Platform: visual sightings, video camera,
thermo-vision camera;


• Activity: visual sightings.


The above list may be regarded as a preselection of the
sensor set used in the remainder of this paper. It is
important to note that some of these sensors may acquire
information related to more than one attribute.
Therefore, a reversed assignment (sensors to attributes)
seems more adequate.


2.2 Sensor-originated information 


Figure 1 presents the preselected target features and their
inclusion relations. It also indicates example sensors
that make it possible to acquire this information.


[Figure 1: diagram of target features and sensors;
recoverable labels: Visual sightings, Activity, Video camera]


Figure 1 Information scope originated from diverse types
of sensors.


It should be noted that although some of these
sources provide information on more than one attribute,
it is possible to identify a hierarchy of relevance of
this information. That is, an attribute that can be
revealed by multiple sources constitutes primary
information for some sources and secondary information
for others:


• Radar: position¹;


• IFF: position, threat;


• Video camera: position, platform, threat;


• Visual sightings: position, threat, platform,
activity.


For visual sightings, where a human plays the role
of the sensor, it is difficult to identify the primary
information. Among the above sources, visual
recognition is the most reliable way of defining the target
activity. Since it also allows the target threat and
platform to be identified, visual recognition may be
considered a specific source of information.

These observations are important for what follows: they
are used in creating the hierarchy of the concept
lexicons as well as in defining the relations among the
concepts of the SN ontology.

Some of these sensors are very complex devices and
require certain interfaces that allow the automatic
acquisition of useful information (in terms of sensor
networks). An example of such a sensor is a video
camera. To make effective use of an image from the video
camera, a specific module is necessary to interpret the
picture and identify the significant features of the
object of interest. In that case, the ontology of the
video camera is defined in that very module, and it is
modifiable as long as there is access to the
configuration of that module. This leads to another
possible classification of sensors:


• Constant (invariant) ontology sensors, e.g. IFF;


• Variant ontology sensors, e.g. a video camera
equipped with an interpretation module, or visual
sightings.


Guided by the principle of maximum information
growth, the next stages of creating the SN ontology take
into account the following sources of attribute
information: IFF, video camera (VC) and visual sightings
(VS).


3 Defining sets of SN ontologies 


With reference to the taxonomy of the term ontology [1],
the authors note that the problem of SN ontology
concerns, in particular, the so-called method and task
ontologies.

The concept lexicons of the Joint C3 Information
Exchange Data Model [2] have been used, with the
considerations constrained to three of the JC3 model
attributes:


• threat: object-item-hostility-status-code;
• platform: surface-vessel-type-category-code;
• activity: action-task-activity-code.


The Dezert-Smarandache Theory (DSmT) of plausible and
paradoxical reasoning [3] has been used to define the
attribute relation functions.


¹ Underline means the prime information.


3.1 Rules for sensor network ontologies 
selection 


Section 2.2 distinguished between variant and invariant
ontology sensors. This division is fundamental to the
creation of the SN ontology, which takes place in four
stages:


1. Creating the fundamental concept lexicon for a 
sensor network, based on invariant concept 
lexicons of particular sensors; 


2. Creating the auxiliary concept lexicon for sensor 
network, based on variant concept lexicons of 
particular sensors; 


3. Extending the fundamental concept lexicon with 
the auxiliary lexicon; 


4. Defining relations among the concepts in sensor 
network; 


According to the definition of ontology given in [4], [5],
the SN ontology may be formulated as follows:


$O = (L, F, G, F^{-1}, G^{-1}, C, R)$ (1)


where:
L is either the concept or the relation lexicon;
F is the function assigning lexicon elements to concepts;
G is the function assigning lexicon elements to relations;
$F^{-1}$ is the function reversed to F, assigning concepts
to elements of the concept lexicon;
$G^{-1}$ is the function reversed to G, assigning relations
to elements of the relation lexicon;
C is the set of all concepts used in the SN;
R is the set of all relations used in the SN.


The above-mentioned concepts and functions are defined
in the following subsections according to the lexicons of
the JC3 model.


3.1.1 Concepts 


Concepts are representations of a group of objects with
the same characteristics, which may be directly
identified by a selected subset of elements of the concept
lexicon [5]. This means, for example, that assigning the
attribute ‘hostile target’ to a target uses the concept of
the ‘hostile target’, which is an element of the set (C) of
all possible concepts for a given sensor network.

Another question is the representation of the concept
‘hostile target’ in the language of a particular source.
For instance, for an IFF device it will be the value
‘FOE’, and for a video camera the value defined in the
interpreting module as ‘HOSTILE’.

Mathematically, the F assignment is not a bijection
in general; moreover, it need not even be a function.
When multiple sources are used, F is not an injection,
whereas if the concept set is ‘rich’ compared with a
‘poor’ lexicon, the reversed assignment $F^{-1}$ is not
injective. This may occur if an SN prepared for fully
defining the target threat is used only for deciding
whether the target is friend or hostile. Then $F^{-1}$
will interpret the concepts ‘training hostile’, ‘training
suspect’ and ‘assumed friend’ as ‘friend’, assigning the
lexical value ‘FRIEND’ [6].

To illustrate the F and $F^{-1}$ assignments, consider
the following example.


Example 1: Let the set of concepts be defined as follows:


C = {‘friend’, ‘assumed friend’, ‘assumed hostile’,
‘hostile’} (2)


and let the concept lexicon be defined as follows:

Lc = {FRIEND, HOSTILE, ASSUMED} (3)


Thus, it is possible to define subsets of the concept 
lexicon elements in such a way that the F assignment 
would be a bijection (Figure 2). 


[Figure 2: {FRIEND} ↔ friendly target;
{ASSUMED, FRIEND} ↔ assumed friendly target;
{ASSUMED, HOSTILE} ↔ assumed enemy target;
{HOSTILE} ↔ enemy target]


Figure 2 F-assignment as a bijection.


Defining subsets of lexical elements as singletons
leads to a non-function F assignment (Figure 3).


Figure 3 F as a non-function assignment.
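

The difference between Figures 2 and 3 can be checked
mechanically. The sketch below is illustrative only: it
encodes Example 1 as Python sets, treats an assignment as
a list of (lexicon subset, concept) pairs, and tests
whether it is a function and a bijection. The helper names
and the exact pairs used for the singleton case are
assumptions made for this illustration, since Figure 3
itself is not reproduced here.

```python
from typing import FrozenSet, Iterable, List, Tuple

Pair = Tuple[FrozenSet[str], str]

# Concept set C from equation (2).
CONCEPTS = {"friend", "assumed friend", "assumed hostile", "hostile"}

# Subsets of the lexicon Lc = {FRIEND, HOSTILE, ASSUMED} as in Figure 2.
SUBSET_ASSIGNMENT: List[Pair] = [
    (frozenset({"FRIEND"}), "friend"),
    (frozenset({"ASSUMED", "FRIEND"}), "assumed friend"),
    (frozenset({"ASSUMED", "HOSTILE"}), "assumed hostile"),
    (frozenset({"HOSTILE"}), "hostile"),
]

# Assumed singleton variant: ASSUMED alone must point at two concepts.
SINGLETON_ASSIGNMENT: List[Pair] = [
    (frozenset({"FRIEND"}), "friend"),
    (frozenset({"ASSUMED"}), "assumed friend"),
    (frozenset({"ASSUMED"}), "assumed hostile"),
    (frozenset({"HOSTILE"}), "hostile"),
]


def is_function(pairs: Iterable[Pair]) -> bool:
    """True if every lexicon subset is assigned to exactly one concept."""
    seen = {}
    for key, concept in pairs:
        if seen.setdefault(key, concept) != concept:
            return False
    return True


def is_bijection(pairs: List[Pair], concepts: set) -> bool:
    """True if the assignment is a function that is injective and onto."""
    if not is_function(pairs):
        return False
    keys = {key for key, _ in pairs}
    images = {concept for _, concept in pairs}
    return len(images) == len(keys) and images == set(concepts)


if __name__ == "__main__":
    print(is_bijection(SUBSET_ASSIGNMENT, CONCEPTS))
    print(is_function(SINGLETON_ASSIGNMENT))
```

With subsets as keys the assignment is one-to-one and onto
C; with singletons the key {ASSUMED} has two images, so the
assignment is a relation rather than a function.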


In the case of ‘rich’ concept lexicon sets it is important
to express the successive target types as conjunctions of
their distinctive features.


Example 2:


Table 1 Example definitions of surface platforms


Platform | Definition
Transporte | AUX ∧ AIR ∧ D ∧ TRAN
Command | AUX ∧ S&MCAL ∧ AIR ∧ C2


where:

AUX: auxiliary vessel;

S&MCAL: equipped with artillery of small and
medium caliber;

AIR: against the air targets;

D: performs landing operations;

C2: command & control;

TRAN: transport of landing forces.

3.1.2 Relations 


Relations define the relationships among concepts.
A relation may be hierarchical or structural. Moreover, for
the purpose of sensor networks, relations may be
classified as:


• Relations I, among the observable attributes of a
diverse type;


• Relations II, among attributes of miscellaneous
origin;


• Relations III, among identical attributes
originating from diverse sources.
Relations among the observable attributes of a
diverse type enable the values of some attributes to be
deduced from the observed values of others. For instance,
the relations between the threat and the platform of the
target enable the deduction of the target activity. The
successive observable attributes are linked according to
the distinctive features of the target mentioned in the
previous section. This means, for example, that defining
the target platform (based on observations) is equivalent
to assigning to the target some of the distinctive
features that a target performing the particular activity
has to possess.

Relations among attributes of miscellaneous origin
(observable and deductable) result in a so-called
observable-deductable attribute. The effective fusion of
information from multiple sources is performed according
to the rules of combination and conditioning obtained
from DSmT [7], [8]. This process is described in detail in
section 3.2.

Relations among identical attributes originating
from diverse sources are the type of relation in which
the key question is the lexical variety of the concepts
used by particular sources. For instance, the threat
attribute value acquired from IFF may be either FRIEND or
FOE, whereas the same attribute obtained from visual
sightings may take a value from {FRIEND, HOSTILE,
UNKNOWN, JOKER, FAKER, ...}. In such a case the value
FRIEND obtained from IFF corresponds exactly to the same
value from the visual sightings, and the value FOE is
equal to HOSTILE. The relations between FRIEND obtained
from IFF and FAKER (or JOKER) obtained from the visual
sightings are less obvious and must be defined according
to the definitions of these training types (JOKER,
FAKER).


3.2 Proposition of sensor network ontology 


This section proposes an ontology framework for a sensor
network dedicated to monitoring the target threat. The
solution uses the concepts and concept lexicons of the
JC3 model. The authors’ intention was to show how the
relations of three attributes (threat, platform and
activity) should be defined, rather than to present the
complete SN ontology.

Table 2 presents a bijective assignment of concepts
to elements of a concept lexicon. As mentioned before,
this assignment need not be a bijection; however, a
bijection is desirable, especially if the sets of values
for the platform and activity attributes are numerous.


Table 2 SN ontology: concepts and concept lexicon.


Attribute | Concept | Concept lexicon
Threat (object-item-hostility-status-code) | An OBJECT-ITEM that is assumed to be a friend because of its characteristics, behavior or origin. | ASSUMED FRIEND
Threat | An OBJECT-ITEM that is positively identified as enemy. | HOSTILE
Threat | ... according to JC3 | ... according to JC3
Platform (surface-vessel-type-category-code) | General designator for aircraft/multi-role aircraft carrier. | AIRCRAFT CARRIER, GENERAL
Platform | Craft 40 meters or less employed to transport sick/wounded and/or medical personnel. | AMBULANCE BOAT
Platform | ... according to JC3 | ... according to JC3
Activity (action-task-activity-code) | To fly over an area, monitor and, where necessary, destroy hostile aircraft, as well as protect friendly shipping in the vicinity of the objective area. | PATROL, MARITIME
Activity | Emplacement or deployment of one or more mines. | MINE-LAYING
Activity | ... according to JC3 | ... according to JC3


The assignment of relations among attributes to
relation lexicons (Table 3) is a surjection. DSmT
combination and conditioning rules have been applied to
define the relations among attributes. The preferred
rule for conditioning is rule no. 12. When combining
evidence, many combination rules may be used, depending
on the particular relation. However, for simplicity, it is
suggested to apply the classic rule of combination
(DSmC), which is commutative and associative.


Table 3 SN ontology: relations and relation lexicon.


Relations | Remarks | Relation lexicon
Relations I | Based on DSmT, distinctive features | →
Relations II | Based on DSmT | ⊕, cond(.)
Relations III | Based on DSmT (combination rule need not be identical with the one in Relations II) | ⊕, cond(.)

Examples of the particular types of relations are given
below. In the case of a relation of type I it is possible
to reason about the value of a certain attribute based on
knowledge of the other ones. However, if an unambiguous
deduction of the third attribute is not possible because
of the multiplicity of possible solutions, abductive
reasoning (selection of the optimal variant) seems
justified.


Relations I:
(Threat, Platform) → Activity: (FAKER, FRIGATE
TRAINING) → TRAIN OPERATIONS;
(Threat, Activity) → Platform: (FAKER, TRAIN
OPERATIONS) → TRAINING CRAFT;
(Platform, Activity) → Threat: (HOUSEBOAT,
PROVIDE CAMPS) → NEUTRAL;


Relations II:

FAKER = cond(obs(FAKER) ⊕ ded(FAKER) ⊕
obs(FRIEND));


Relations III:

FAKER = cond(obs(FAKER) ⊕ VS(FAKER) ⊕
IFF(FRIEND));


The abductive reasoning process may be systemized by 
application of DSmT, where the selection of the optimal 
value takes place after calculating the basic belief 
assignment. 


Example 3:
(Threat, Activity) → Platform: (FRIEND, MINE
HUNTING MARITIME) →
MINEHUNTER COASTAL (MHC) ∨
MINEHUNTER COASTAL WITH DRONE (MHCD) ∨
MINEHUNTER GENERAL (MH) ∨
MINEHUNTER INSHORE (MHI) ∨
MINEHUNTER OCEAN (MHO) ∨
MINEHUNTER/SWEEPER COASTAL (MHSC) ∨
MINEHUNTER/SWEEPER GENERAL (MHS) ∨
MINEHUNTER/SWEEPER OCEAN (MHSO) ∨
MINEHUNTER/SWEEPER W/DRONE (MHSD)


Applying DSmT, a certain mass of belief is assigned to
each of the possible hypotheses, e.g.:
m(MHC) = 0.2, m(MHCD) = 0.3, m(MH) = 0.1,
m(MHI) = 0.1, m(MHO) = 0.1, m(MHSC) = 0.05,
m(MHS) = 0.05, m(MHSO) = 0.05, m(MHSD) = 0.05


Based on the obtained basic belief assignment (bba) belief 
functions, referring to particular hypotheses, may be 
calculated. In the simplest case, assuming all of the 
hypotheses are exclusive, the subsequent belief functions 
will be equal to respective masses, e.g. Bel(MHC) = 
m(MHC), Bel(MHCD) = m(MHCD), etc. 

A more complex case, in which relationships among
hypotheses are taken into account, is considered in
the next section.
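

The simplest case of Example 3 can be sketched in a few
lines. The code below is an illustration rather than the
authors' implementation: it stores the example masses in a
dictionary, checks that they sum to one, and, under the
stated assumption that all hypotheses are exclusive, takes
Bel(X) = m(X) and selects the hypothesis with the largest
belief. The function names are hypothetical.

```python
from typing import Dict

# Example masses from Example 3 (hypotheses assumed mutually exclusive).
BBA: Dict[str, float] = {
    "MHC": 0.2, "MHCD": 0.3, "MH": 0.1, "MHI": 0.1, "MHO": 0.1,
    "MHSC": 0.05, "MHS": 0.05, "MHSO": 0.05, "MHSD": 0.05,
}


def is_valid_bba(m: Dict[str, float], tol: float = 1e-9) -> bool:
    """A bba must be non-negative and sum to one."""
    return all(v >= 0.0 for v in m.values()) and abs(sum(m.values()) - 1.0) <= tol


def belief_exclusive(m: Dict[str, float]) -> Dict[str, float]:
    """With exclusive singleton hypotheses, Bel(X) equals m(X)."""
    return dict(m)


def select_hypothesis(m: Dict[str, float]) -> str:
    """Abductive choice: the hypothesis with the largest belief."""
    bel = belief_exclusive(m)
    return max(bel, key=bel.get)


if __name__ == "__main__":
    print(is_valid_bba(BBA), select_hypothesis(BBA))
```

For these masses the selected platform is MHCD, the only
hypothesis with mass 0.3.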
4 Verification of the usefulness of
elaborated ontology sets


The presented framework of the SN ontology, intended for
target threat assessment, requires verification. In
particular, it is important to verify the correctness of
the reasoning processes and of the combination of the
reasoning results with observation information.

The proposed solution differs substantially from the
existing deterministic ontology-based methods because it
explicitly introduces uncertainty into the relations
among target attributes. This section therefore focuses
on verifying these relation reasoning mechanisms rather
than the completeness of the target representation by the
sensor network.


4.1 Assumptions


To verify the usefulness of the proposed ontology
framework, a specially designed demonstrator application
for evaluating target threat information has been used.
This application simulates the acquisition of information
from diverse sources, such as radar, video camera and
visual sightings.

It is assumed that the visual sighting is also a source
of information about the target platform and the target
activity. The bba values for the platform and activity
attributes have been assigned arbitrarily. During
experimentation both the observable and the deductable
attributes have been taken into account. Frames of
discernment for observation and deduction may differ in
general. To verify the proposed ontology sets, the
example from section 3.2 is considered. Additionally, the
following is assumed:


• Application of the hybrid DSmT model:
  o The hypotheses are not exclusive;
  o The hypotheses correspond to the JC3
    model terminology;


• In relations of type II and III the hybrid rule of
combination (DSmH) has been applied;


• The conditioning rule no. 12 has been used for
updating evidence.


4.2 Numerical experiments 


Figure 4 shows a randomly generated trajectory of the
target whose threat value is to be determined.
Observations are taken synchronously from three sources
(visual sightings, radar system with IFF, and video
camera).

Green marks the observations successfully acquired by
each of the sources. Red marks the observations that
could not be acquired because the target was outside the
detection zone of a particular source [3].

Taking the last sample as an example, the respective
bba are as shown in Table 4.





[Figure 4: plot of the target trajectory with per-source
observation markers; recoverable label: Radar]


Figure 4 Randomly generated target trajectory and its
threat evaluation based on radar, VS and VC observations.


Table 4 Bba gathered from diverse sources: visual
sightings (VS), video camera (VC) and radar.


Threat | VS     | VC     | Radar/IFF
HOS    | 0.0024 | 0.0004 | 0.0008
UNK    | 0.0060 | 0.0012 | -
NEU    | 0.0068 | 0.0015 | -
JOK    | 0.0109 | -      | -
FRD    | 0.2400 | 0.4368 | 0.8773
FAK    | 0.0292 | 0.0049 | 0.0119
SUS    | 0.0032 | 0.0005 | 0.0011
AFR    | 0.0215 | 0.0046 | 0.0088
PEN    | 0.6800 | 0.5500 | 0.1000


A relation of type III, combining information from IFF
and the visual sightings, results in accepting that the
target is friendly:


$Threat_{VS} \oplus Threat_{IFF} = FRIEND$ (4)

From the visual sightings it is also found that the target
activity is mine-hunting (MINE HUNTING MARITIME).
Thus, the relation of type I between the threat and the
activity attributes results in the selection of a target
platform related to searching for mines:


(FRIEND, MINE HUNTING MARITIME) → Platform (5)
In the considered case it is assumed that the frame of
discernment of the platform attribute originating from
the video camera is defined as follows:


$\Theta_{VC}$ = {MHC, MHI, MHO, MSC, MSO, D} (6)

where:

MHC: MINEHUNTER COASTAL;

MHI: MINEHUNTER INSHORE;

MHO: MINEHUNTER OCEAN;

MSC: SWEEPER COASTAL;

MSO: SWEEPER OCEAN;

D: DRONE.


Additionally, with the ∪ and ∩ operators, secondary
hypotheses may be created which refer to other values
of the platform attribute (surface-vessel-type-category-
code) of the JC3 model:

MHC ∩ D = MHCD (MINEHUNTER COASTAL
WITH DRONE);

MHI ∪ MHO ∪ MHC ∩ D = MH (MINEHUNTER
GENERAL);

MHO ∩ MSO = MHSO (MINEHUNTER/SWEEPER
OCEAN);

(MHC ∩ MSC) ∩ D = MHSD
(MINEHUNTER/SWEEPER W/DRONE);

(MHO ∩ MSO) ∪ (MHC ∩ MSC) ∪ D = MHS
(MINEHUNTER/SWEEPER GENERAL);


The basic belief assignment for the video camera
observation may be defined as follows:

$m_{VC}$(MHC) = 0.1, $m_{VC}$(MHCD) = 0.1,

$m_{VC}$(MSC) = 0.2, $m_{VC}$(MHD) = 0.3,

$m_{VC}$(MHO) = 0.2, $m_{VC}$(MSO) = 0.1.


Due to the implication (5), the above bba may be modified
according to BCR12 with the following condition:


Cond: Truth = MHC ∪ MHO ∪ MHI (7)





Figure 5 Venn diagram for the platform attribute. The
truth is colored grey.


Thus, the resulting bba for the platform attribute is
updated as follows:

$m_r$(MHC|Cond) = $m_{VC}$(MHC) + $m_{VC}$(MHCD) = 0.2,
$m_r$(MHSC|Cond) = $m_{VC}$(MSC) = 0.2,
$m_r$(MHI|Cond) = $m_{VC}$(MHD) = 0.3,
$m_r$(MHO|Cond) = $m_{VC}$(MHO) = 0.2,
$m_r$(MHSO|Cond) = $m_{VC}$(MSO) = 0.1,


which, after calculating the respective belief and
plausibility functions, leads to acceptance of the
hypothesis MHC (MINEHUNTER COASTAL) for the
platform attribute of the whole sensor network.

It is worth noting that, before updating, the belief
function for MHC has the least value, since:

$Bel_{VC}$(MHC) = $m_{VC}$(MHC) = 0.1 (8)

After updating, because the mass assigned to MHSC
supports the belief in the MHC hypothesis, this
hypothesis becomes the most credible, since:


$Bel_r$(MHC) = $m_r$(MHC) + $m_r$(MHSC) = 0.4 (9)
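

The arithmetic of this update can be reproduced with a
short sketch. It is not an implementation of the general
conditioning rule no. 12: the transfer map below simply
copies the five assignments listed in the text, and the
set of hypotheses supporting MHC is taken from equation
(9). The names TRANSFER, SUPPORTS_MHC, update_bba and
belief are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable

# Video camera bba as listed in the text (labels kept as printed).
M_VC: Dict[str, float] = {
    "MHC": 0.1, "MHCD": 0.1, "MSC": 0.2, "MHD": 0.3, "MHO": 0.2, "MSO": 0.1,
}

# Where each prior mass ends up after conditioning, copied from the text.
TRANSFER: Dict[str, str] = {
    "MHC": "MHC", "MHCD": "MHC", "MSC": "MHSC",
    "MHD": "MHI", "MHO": "MHO", "MSO": "MHSO",
}

# Hypotheses whose updated mass supports MHC, as used in equation (9).
SUPPORTS_MHC = ("MHC", "MHSC")


def update_bba(m: Dict[str, float], transfer: Dict[str, str]) -> Dict[str, float]:
    """Move every prior mass to its target hypothesis and accumulate."""
    updated = defaultdict(float)
    for hypothesis, mass in m.items():
        updated[transfer[hypothesis]] += mass
    return dict(updated)


def belief(m: Dict[str, float], supporters: Iterable[str]) -> float:
    """Sum the masses of the hypotheses that support the target."""
    return sum(m.get(h, 0.0) for h in supporters)


if __name__ == "__main__":
    m_r = update_bba(M_VC, TRANSFER)
    print(m_r)
    print(belief(M_VC, ["MHC"]), belief(m_r, SUPPORTS_MHC))
```

Before the update only the 0.1 mass of MHC supports the
hypothesis; afterwards the masses of MHC and MHSC add up to
0.4, which exceeds every other updated mass.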


5 Conclusions 


The results of the numerical experiments presented in the
previous section show that applying DSmT to define the
relations among target attributes makes it possible to
unify information acquired from sensors with information
obtained by deductive reasoning. This affects the whole
SN ontology, because the SN concept lexicon becomes
substantially modified. It is no longer a union of the
lexicons of the individual sensors, as would be expected
in the deterministic case. Instead, the SN concept
lexicon is extended with intersections and unions of the
hypotheses created upon the lexicons of particular
sensors.

The experiments used the JC3 model’s lexicon of the
surface-vessel-type-category-code attribute. It is
important to note that, despite its large volume, the
lexicon is not structured. The emerging conclusion is
that arranging the JC3 lexicons in a hierarchy would
bring tangible benefits, because a hierarchy makes it
possible to create hypotheses with the ∪ and ∩ operators
more effectively, and this in turn increases the
precision of the reasoning processes based on
information acquired from sensors.
GMTI and IMINT Data Fusion for Multiple
Target Tracking and Classification


Benjamin Pannetier 
Jean Dezert 


Abstract - In this paper, we propose a new approach
to track multiple ground targets with GMTI (Ground
Moving Target Indicator) and IMINT (IMagery
INTelligence) reports. This tracking algorithm takes
into account road network information and is adapted to
the out-of-sequence measurement problem. The scope of
the paper is to fuse the attribute type information given
by heterogeneous sensors with DSmT (Dezert-Smarandache
Theory) and to introduce the type results into the
tracking process. We show, on a realistic scenario, the
improvement in ground target tracking obtained from
better target discrimination and efficient management
of conflicting information.


Keywords: Multiple target tracking, heterogeneous 
data fusion, DSmT. 


1 Introduction 


Data fusion for ground battlefield surveillance is
increasingly strategic, both for building the situational
assessment and for improving the precision of fire
control systems. The challenge of data fusion for theatre
surveillance operations is to know where the targets are,
how they evolve (manoeuvres, group formations, ...) and
what their identities are.

For the first two questions, we develop new ground
target tracking algorithms adapted to GMTI (Ground
Moving Target Indicator) sensors. GMTI sensors are able
to cover a large surveillance area for a few hours, or
longer if several sensors exist. However, ground target
tracking algorithms operate in a complex environment:
the high traffic density and the false alarms generate a
significant quantity of data, the terrain topography can
create occlusion areas for the sensor, and the high
maneuverability of the ground targets leads to the data
association problem. Several references exist for MGT
(Multiple Ground Tracking) with GMTI sensors [1, 2],
which fuse contextual information with MTI reports. The
main results are the improvement of the track precision
and track continuity. Our algorithm [6] is built on
several ideas inspired by this literature. Based on
road segment positions, dynamic motion models under
road constraints are built, and an optimized projection
of the estimated target states is proposed to keep the
track on the road. A VS-IMM (Variable Structure
Interacting Multiple Models) filter is created with a set
of constrained models to deal with the target maneuvers
on the road. The set of models used in the variable
structure is adjusted sequentially according to target
positions and to the road network topology.


Originally published as Pannetier B., Dezert J., GMTI
and IMINT data fusion for multiple target tracking and
classification, in Proc. of Fusion 2009, Seattle, WA, USA,
6-9 July 2009, and reprinted with permission.


We now extend the MGT to several sensors. In this
paper, we first consider the centralized fusion of GMTI
and IMINT (IMagery INTelligence) sensor reports. The
first problem of data fusion with several sensors is
data registration, which is needed to work in the same
geographic and time reference frames. This point is not
presented in this paper. However, in a multisensor
system, measurements can arrive out of sequence.
Following Bar-Shalom and Chen’s work [3], the VS-IMMC
(VS-IMM Constrained) algorithm is adapted to the OOSM
(Out Of Sequence Measurement) problem in order to avoid
reprocessing the entire sequence of measurements. The
VS-IMMC is also extended to a multiple target context
and integrated in an SB-MHT (Structured Branching -
Multiple Hypotheses Tracking). Despite the resulting
track continuity improvement of the VS-IMMC SB-MHT
algorithm, unavoidable association ambiguities arise in
a multi-target context when several targets move in close
formation (crossing and passing). The associations
between all constrained predicted states are compromised
if we use only the observed locations as measurements.
The weakness of this algorithm is due to the lack of
good target state discrimination.


One way to enhance data association is to use the
classification attribute of the reports. In our previous
work [5], the classification information of the MTI
segments was introduced into the target tracking
process. The idea was to maintain alongside each target
track a set of ID hypotheses. Their committed beliefs are
revised in real time with the classifier decision through
a very recent and efficient fusion rule called
proportional conflict redistribution (PCR). In this
paper, in addition to the measurement location fusion, a
study is carried out to fuse the MTI classification type
with the image classification type associated with each
report. The attribute type of the image sensors belongs
to a different and better classification than that of the
MTI sensors. The drawback is the short coverage of image
sensors, which yields a low quantity of data. In
section 2, the motion and measurement models are
presented with a new ontological model in order to place
the different classification frames in the same frame of
discernment. After the VS-IMMC description given in
section 3, the PCR fusion rule, originally developed in
the DSmT (Dezert-Smarandache Theory) framework, is
presented in section 4 to fuse the available target type
information and to include the resulting fused target ID
in the tracking process. The last part of this paper is
devoted to simulation results for a multiple target
tracking scenario within a real environment.


2 Motion & observation models 


2.1 GIS description 


The GIS (Geographical Information System) used in
this work contains both the segmented road network
and the DTED (Digital Terrain Elevation Data). Each
road segment expressed in WGS84 is converted into a
Topographic Coordinate Frame (denoted TCF). The
TCF is defined with respect to an origin O in such a
way that the axes X, Y and Z are oriented towards the
local East, North and Up directions, respectively. The
target tracking process is carried out in the TCF.
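

A conversion of this kind can be sketched as follows. The
paper does not state which conversion is used; the code
below is a common textbook route, assumed here for
illustration: WGS84 geodetic coordinates are first mapped
to Earth-centred Cartesian coordinates and then rotated
into a local East-North-Up frame at the origin O. The
function names and the sample origin are hypothetical.

```python
import math

# WGS84 ellipsoid constants.
WGS84_A = 6378137.0                    # semi-major axis [m]
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Latitude/longitude [deg] and ellipsoidal height [m] to ECEF [m]."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + h) * math.sin(lat)
    return x, y, z


def wgs84_to_tcf(lat_deg, lon_deg, h, lat0_deg, lon0_deg, h0):
    """Point in WGS84 to (East, North, Up) [m] relative to the origin O."""
    x, y, z = geodetic_to_ecef(lat_deg, lon_deg, h)
    x0, y0, z0 = geodetic_to_ecef(lat0_deg, lon0_deg, h0)
    dx, dy, dz = x - x0, y - y0, z - z0
    lat0, lon0 = math.radians(lat0_deg), math.radians(lon0_deg)
    sl, cl = math.sin(lat0), math.cos(lat0)
    so, co = math.sin(lon0), math.cos(lon0)
    east = -so * dx + co * dy
    north = -sl * co * dx - sl * so * dy + cl * dz
    up = cl * co * dx + cl * so * dy + sl * dz
    return east, north, up


if __name__ == "__main__":
    origin = (48.0, 2.0, 100.0)  # hypothetical origin O
    print(wgs84_to_tcf(48.01, 2.01, 100.0, *origin))
```

Each end point of a road segment would be converted in
this way once, so that the tracking filter can work
entirely in metric local coordinates.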


2.2 Constrained motion model 


The target state at the current time tẹ is defined in 
the local horizontal plane of the TCF: 


x(k) = (x(k) i(k) yk) yk) (1) 


where (x(k), y(k)) and (i(k), y(k)) define respectively 
the target location and velocity in the local horizon- 
tal plane. The dynamics of the target evolving on the 
road are modelized by a first-order differential system. 
The target state on the road segment s is defined by 
Xs(k) where the target position (rs(k), ys(k)) belongs 
to the road segment s and the corresponding heading 
(%s(k), ¥s(k)) is in its direction. 

The event that the target is on road segment $s$ is denoted $e_s(k) = \{\mathbf{x}(k) \in s\}$. Given the event $e_s(k)$ and according to a motion model $M_i$, the estimation of the target state can be improved by considering the road segment $s$. It follows:

$$\mathbf{x}_s(k) = F_{s,i}(\Delta(k)) \cdot \mathbf{x}_s(k-1) + \Gamma(\Delta(k)) \cdot \mathbf{v}_{s,i}(k) \qquad (2)$$

where $\Delta(k)$ is the sampling time, $F_{s,i}$ is the state transition matrix associated with the road segment $s$ and adapted to the motion model $M_i$, and $\mathbf{v}_{s,i}(k)$ is a white Gaussian random vector with covariance matrix $Q_{s,i}(k)$, chosen in such a way that the standard deviation along the road segment is higher than the standard deviation in the orthogonal direction. It is defined by:

$$Q_{s,i}(k) = R_{\theta_s} \cdot \begin{pmatrix} \sigma_d^2 & 0 \\ 0 & \sigma_n^2 \end{pmatrix} \cdot R_{\theta_s}^{T} \qquad (3)$$

where $R_{\theta_s}$ is the rotation matrix associated with the direction $\theta_s$ defined in the plane $(O, X, Y)$ of the road segment $s$. The matrix $\Gamma(\Delta_k)$ is defined in [8].
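The construction of the process noise covariance in (3) can be illustrated with a minimal Python sketch. It assumes a counter-clockwise rotation matrix with $\theta_s$ measured from the X (East) axis, and it assumes that `sigma_d` and `sigma_n` are the standard deviations along and orthogonal to the road segment; the function name and these conventions are illustrative, not taken from the paper.

```python
import math

def process_noise_cov(theta_s, sigma_d, sigma_n):
    """Return Q = R(theta_s) * diag(sigma_d^2, sigma_n^2) * R(theta_s)^T as a 2x2 list.

    Assumption: R(theta) = [[cos, -sin], [sin, cos]] (counter-clockwise rotation).
    """
    c, s = math.cos(theta_s), math.sin(theta_s)
    var_d, var_n = sigma_d ** 2, sigma_n ** 2
    q11 = c * c * var_d + s * s * var_n      # variance along X
    q12 = c * s * (var_d - var_n)            # X/Y cross-covariance
    q22 = s * s * var_d + c * c * var_n      # variance along Y
    return [[q11, q12], [q12, q22]]

# Road segment aligned with the X axis: the noise is larger along X than along Y.
Q_east = process_noise_cov(0.0, 0.8, 0.4)
# Road segment aligned with the Y axis: the two variances swap places.
Q_north = process_noise_cov(math.pi / 2, 0.8, 0.4)
```

For a segment that is neither horizontal nor vertical, the off-diagonal term is non-zero, which is how the road direction couples the X and Y noise components.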

To improve the modeling for targets moving on a road network, we proposed in [5] to adapt the level of the dynamic model's noise based on the length of the road segment $s$. The idea is to increase the standard deviation $\sigma_n$ defined in (3) to take into account the error on the road segment location. After the state estimation obtained by a Kalman filter, the estimated state is then projected according to the road constraint $e_s(k)$. This process is detailed in [6].


2.3 GMTI measurement model 


According to the NATO GMTI format [7], the MTI reports received at the fusion station are expressed in the WGS84 coordinate system. The MTI reports must be converted into the TCF. An MTI measurement $\mathbf{z}$ at the current time $t_k$ is given in the TCF by:

$$\mathbf{z}_{MTI}(k) = [x(k) \;\; y(k) \;\; \dot{\rho}(k)]^{T} \qquad (4)$$

where $(x(k), y(k))$ is the location of the MTI report in the local frame $(O, X, Y)$ and $\dot{\rho}(k)$ is the associated range radial velocity measurement expressed by:

$$\dot{\rho}(k) = \frac{(x(k) - x_c(k)) \cdot \dot{x}(k) + (y(k) - y_c(k)) \cdot \dot{y}(k)}{\sqrt{(x(k) - x_c(k))^2 + (y(k) - y_c(k))^2}} \qquad (5)$$

where $(x_c(k), y_c(k))$ is the sensor location at the current time in the TCF. Because the range radial velocity is correlated with the MTI location components, a standard extended Kalman filter (EKF) is not well suited. Several techniques exist in the literature to decorrelate the range radial velocity from the location components. We prefer the AEKF (Alternative Extended Kalman Filter) proposed by Bizup and Brown in [9], because its alternative linearization is easier to implement than the other decorrelation algorithms. Moreover, the AEKF working in the sensor frame remains invariant by translation. The AEKF measurement equation is given by:

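$$\mathbf{z}_{MTI}(k) = H_{MTI}(k) \cdot \mathbf{x}(k) + \mathbf{w}_{MTI}(k) \qquad (6)$$

where $\mathbf{w}_{MTI}(k)$ is a zero-mean white Gaussian noise vector with covariance $R_{MTI}(k)$ (given in [5]) and $H_{MTI}(k)$ is defined by:

$$H_{MTI}(k) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & \dfrac{\partial \dot{\rho}(k)}{\partial \dot{x}} & 0 & \dfrac{\partial \dot{\rho}(k)}{\partial \dot{y}} \end{pmatrix} \qquad (7)$$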
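As a hedged illustration of the measurement model, the following Python sketch computes the range radial velocity from a state ordered as $[x, \dot{x}, y, \dot{y}]$ and builds an observation matrix whose third row holds the partial derivatives of the range radial velocity with respect to the velocity components. The function names are hypothetical, and the sketch assumes the target is not located exactly at the sensor position (non-zero range).

```python
import math

def range_rate(state, sensor_xy):
    """Range radial velocity of a target seen from a sensor at (xc, yc).

    state is [x, vx, y, vy] in the TCF.
    """
    x, vx, y, vy = state
    dx, dy = x - sensor_xy[0], y - sensor_xy[1]
    return (dx * vx + dy * vy) / math.hypot(dx, dy)

def h_mti(state, sensor_xy):
    """3x4 observation matrix: position rows plus range-rate partials w.r.t. velocity."""
    x, _, y, _ = state
    dx, dy = x - sensor_xy[0], y - sensor_xy[1]
    r = math.hypot(dx, dy)
    return [
        [1.0, 0.0, 0.0, 0.0],        # observes x
        [0.0, 0.0, 1.0, 0.0],        # observes y
        [0.0, dx / r, 0.0, dy / r],  # d(rho_dot)/d(vx), d(rho_dot)/d(vy)
    ]

state = [3.0, 1.0, 4.0, 2.0]
sensor = (0.0, 0.0)
H = h_mti(state, sensor)
# The third row applied to the state reproduces the range radial velocity.
predicted_rate = sum(h * s for h, s in zip(H[2], state))
```

Because the range radial velocity is linear in the velocity components once the line of sight is fixed, the third row of the matrix applied to the state gives the same value as `range_rate`.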








Each MTI report is characterized by its location and velocity information and also by attribute information together with the probability that it is correct. We denote by $C_{MTI}$ the frame of discernment on target ID based on MTI data. $C_{MTI}$ is assumed to be constant over time and consists of a finite set of exhaustive and exclusive elements representing the possible states of the target classification. In this paper, we consider only 3 elements in $C_{MTI}$, defined as:

$$C_{MTI} = \left\{ \begin{array}{l} c_1 \triangleq \text{Tracked vehicle} \\ c_2 \triangleq \text{Wheeled vehicle} \\ c_3 \triangleq \text{Rotary wing aircraft} \end{array} \right\} \qquad (8)$$

We also consider the probabilities $P\{c(k)\}$ ($\forall c(k) \in C_{MTI}$) as input parameters of our tracking system, characterizing the global performance of the classifier. The vector of probabilities $[P(c_1) \;\; P(c_2) \;\; P(c_3)]$ represents the diagonal of the confusion matrix of the classification algorithm assumed to be used. Let $\mathbf{z}^{*}_{MTI}(k)$ be the extended MTI measurement, including both the kinematic part and the attribute part, expressed by:

$$\mathbf{z}^{*}_{MTI}(k) \triangleq \{\mathbf{z}_{MTI}(k), c(k), P\{c(k)\}\} \qquad (9)$$


2.4 IMINT motion model 


For imagery intelligence (IMINT), we consider two sensor types: a video EO/IR sensor carried by an Unmanned Aerial Vehicle (UAV) and an EO sensor fixed on an Unattended Ground Sensor (UGS).

We assume that the video information given by both sensor types is processed by their own ground stations and that the system provides the video reports of target detections with their classification attributes. Moreover, a human operator selects targets on a movie frame and is able to choose their attributes with an HMI (Human Machine Interface). With the UAV, the operator is able to select several targets on a frame. In contrast, the operator selects only one target on the frames given by the UGS. There is no false alarm, but a target may go undetected by the operator (due to terrain masking, for example). The video report on the movie frame is converted into the TCF. The measurement equation is given by:


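$$\mathbf{z}_{video}(k) = H_{video}(k) \cdot \mathbf{x}(k) + \mathbf{w}_{video}(k) \qquad (10)$$

where $H_{video}$ is the observation matrix of the video sensor:

$$H_{video} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \qquad (11)$$

The white Gaussian noise process $\mathbf{w}_{video}(k)$ is centered and has a known covariance $R_{video}(k)$ given by the ground station.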

Each video report is associated with the attribute information $c(k)$ and the probability $P\{c(k)\}$ that it is correct. We denote by $C_{video}$ the frame of discernment for an EO/IR source. Like $C_{MTI}$, $C_{video}$ is assumed to be constant over time and consists of a finite set of exhaustive and exclusive elements representing the possible states of the target classification. In this paper, we consider only eight elements in $C_{video}$ as follows:

$$C_{video} = \left\{ \begin{array}{l} \text{civilian car} \\ \text{military armoured car} \\ \text{wheeled armoured vehicle} \\ \text{civilian bus} \\ \text{military bus} \\ \text{civilian truck} \\ \text{military armoured truck} \\ \text{copter} \end{array} \right\} \qquad (12)$$
Let $\mathbf{z}^{*}_{video}(k)$ be the extended video measurement, including both the kinematic part and the attribute part, expressed by the following formula ($\forall c(k) \in C_{video}$):

$$\mathbf{z}^{*}_{video}(k) \triangleq \{\mathbf{z}_{video}(k), c(k), P\{c(k)\}\} \qquad (13)$$

For notational convenience, the measurement sequence $Z^{k,l}$ represents a possible set of measurements generated by the target up to time $k$ (i.e., there exists a subsequence $n$ and a measurement $j$ such that $Z^{k,l} = \{Z^{k-1,n}, \mathbf{z}^{*}_{j}(k)\}$) associated with the track $T^{k,l}$. At the current time $k$, the track $T^{k,l}$ is represented by a sequence of state estimates. $\mathbf{z}^{*}_{j}(k)$ is the $j$-th measurement available at time $k$ among the $m(k)$ validated measurements around the target measurement prediction.


3 Tracking with road constraints 
3.1 VS IMM with a road network 


The IMM is an algorithm for combining state estimates arising from multiple filter models to get a better global state estimate when the target is maneuvering. In section 2.2, a motion model $i$ constrained to a road segment $s$, denoted $M_s^i(k)$, was defined. Here we extend the segment constraint to the different dynamic models (among a set of $r+1$ motion models) that a target can follow. The model indexed by $i = 0$ is the stop model. When the target moves from one segment to the next, the set of dynamic models changes. In a conventional IMM estimator [1], the likelihood function of a model $i = 0, 1, \ldots, r$ is given, for a track $T^{k,l}$ associated with the $j$-th measurement, $j \in \{0, 1, \ldots, m(k)\}$, by:

$$\Lambda_i^l(k) = p\{\mathbf{z}_j(k) \mid M_s^i(k), Z^{k-1,n}\} \qquad (14)$$

where $Z^{k-1,n}$ is the subsequence of measurements associated with the track $T^{k,l}$.

Using the IMM estimator with a stop motion model, the likelihood function of the moving target mode for $i = 1, \ldots, r$ and for $j \in \{0, 1, \ldots, m(k)\}$ is given by:

$$\Lambda_i^l(k) = P_D \cdot p\{\mathbf{z}_j(k) \mid M_s^i(k), Z^{k-1,n}\} \cdot (1 - \delta_{m_j,0}) + (1 - P_D) \cdot \delta_{m_j,0} \qquad (15)$$

while the likelihood of the stopped target mode (i.e. $i = 0$) is:

$$\Lambda_0^l(k) = p\{\mathbf{z}_j(k) \mid M_s^0(k), Z^{k-1,n}\} = \delta_{m_j,0} \qquad (16)$$

where $P_D$ is the sensor detection probability and $\delta_{m_j,0}$ is the Kronecker function defined by $\delta_{m_j,0} = 1$ if $m_j = 0$ and $\delta_{m_j,0} = 0$ whenever $m_j \neq 0$.

The combined/global likelihood function $\Lambda^l(k)$ of a track including a stop model is then given by:

$$\Lambda^l(k) = \sum_{i=0}^{r} \Lambda_i^l(k) \cdot \mu_i(k|k-1) \qquad (17)$$

where $\mu_i(k|k-1)$ are the predicted model probabilities [8].
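The way the stop model enters the likelihoods (15) to (17) can be sketched in a few lines of Python. The sketch takes the kinematic density value and the indicator $m_j$ as given inputs and only uses whether $m_j$ equals zero; the function names and the numerical values are illustrative assumptions, not parameters from the paper.

```python
def moving_model_likelihood(kinematic_pdf, p_d, m_j):
    """Likelihood of a moving-target model, following the structure of (15)."""
    delta = 1.0 if m_j == 0 else 0.0          # Kronecker delta on m_j
    return p_d * kinematic_pdf * (1.0 - delta) + (1.0 - p_d) * delta

def stop_model_likelihood(m_j):
    """Likelihood of the stop model, following (16): the Kronecker delta itself."""
    return 1.0 if m_j == 0 else 0.0

def combined_likelihood(model_likelihoods, predicted_probs):
    """Global track likelihood (17): model likelihoods weighted by predicted model probabilities."""
    return sum(lik * mu for lik, mu in zip(model_likelihoods, predicted_probs))

# Three models (stop, low-noise, high-noise) with hypothetical numbers.
p_d, m_j = 0.9, 1
likelihoods = [
    stop_model_likelihood(m_j),
    moving_model_likelihood(0.02, p_d, m_j),
    moving_model_likelihood(0.01, p_d, m_j),
]
track_likelihood = combined_likelihood(likelihoods, [0.2, 0.5, 0.3])
```

With $m_j \neq 0$ the stop model contributes nothing and the moving models are scaled by $P_D$; with $m_j = 0$ the moving models reduce to $1 - P_D$ and the stop model likelihood equals one.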

The steps of the IMM under road segment s con- 
straint are the same as for the classical IMM as de- 
scribed in [8]. 

In real applications, the predicted state could also fall onto another road segment, because of a road turn for example, and we need to introduce new constrained motion models. In such a case, we activate the most probable road segment sets depending on the location of the local predicted state $\hat{\mathbf{x}}^{l}_{i,s}(k|k-1)$ of the track $T^{k,l}$ [5, 1]. We consider $r+1$ oriented graphs which depend on the road network topology. For each graph $i$, $i = 0, 1, \ldots, r$, each node is a constrained motion model $M_s^i$. The nodes are connected to each other according to the road network configuration, and one has a finite set of $r+1$ motion models constrained to a road section. The selection of the most probable motion model set, used to estimate the road section on which the target is moving, is based on a sequential probability ratio test (SPRT).


3.2 OOSM algorithm 


Data fusion operating in a centralized architecture suffers from delayed measurements due to communication data links, algorithm execution time, data quantity, etc. In order to avoid reordering and reprocessing an entire sequence of measurements in real-time applications, the delayed measurements are processed as out-of-sequence measurements (OOSM). The algorithm used in this work is described in [3]. In addition, according to the road network constraint, the state retrodiction step is done on the road.


3.3 Multiple target tracking 


For the MGT problem, we use the SB-MHT (Structured Branching Multiple Hypotheses Tracking) presented in [10]. When the new measurement set $Z(k)$ is received, a standard gating procedure is applied in order to validate MTI report to track pairings. The existing tracks are updated with the VS-IMMC and the extrapolated and confirmed tracks are formed. More details can be found in chapter 16 of [10]. In order to mitigate the association problem, we need a probabilistic expression for the evaluation of the track formation hypotheses that includes all aspects of the data association problem. It is convenient to use the log-likelihood ratio (LLR), or track score, of a track $T^{k,l}$, which can be expressed at the current time $k$ in the following recursive form:





$$L^l(k) = L^l(k-1) + \Delta L^l(k) \qquad (18)$$

with

$$\Delta L^l(k) = \log\left(\frac{\Lambda^l(k)}{\lambda_{fa}}\right) \qquad (19)$$

and 
L(0) = log >) (20) 


where $\lambda_{fa}$ and $\lambda_{nt}$ are respectively the false alarm rate and the new target rate per unit of surveillance volume and $\Lambda^l(k)$ is the likelihood given in (17).
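The recursive track score can be sketched as follows in Python. This is a minimal illustration that assumes the score increment is the natural logarithm of the track likelihood divided by the false alarm rate, and that the likelihood is strictly positive; the function name is hypothetical and the initial score is left as an input.

```python
import math

def update_track_score(prev_score, likelihood, lambda_fa):
    """One step of the recursive log-likelihood ratio: previous score plus log(likelihood / lambda_fa)."""
    return prev_score + math.log(likelihood / lambda_fa)

# A likelihood above the false alarm rate raises the score, one below it lowers the score.
lambda_fa = 1e-8
score = 0.0
for likelihood in (2e-8, 2e-8, 5e-9):
    score = update_track_score(score, likelihood, lambda_fa)
```

In this toy run the two first updates each add log 2 and the last one subtracts log 2, so the score stays positive only as long as the measurements explain the track better than false alarms do.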


4 Target type tracking 


In [4], Blasch and Kahler fused the identification attribute given by EO/IR sensors with the position measurement. The fusion was used in the validation gate process to select only the measurements satisfying the usual kinematic criterion and the belief on the identification attribute. Our approach is different, since we use the belief on the identification attribute to revise the LLR with the posterior pignistic probability on the target type. We briefly recall the Target Type Tracking (TTT) principle and explain how to improve the VS-IMMC SB-MHT with target ID information. TTT is based on the sequential combination (fusion) of the predicted belief of the type of the track with the current "belief measurement" obtained from the target classifier decision. Results depend on the quality of the classifier, characterized by its confusion matrix (assumed to be known at least partially, as specified by STANAG). The adopted combination rule is the so-called Proportional Conflict Redistribution rule no. 5 (PCR5) developed in the DSmT (Dezert-Smarandache Theory) framework, since it deals efficiently with (potentially highly) conflicting information. A detailed presentation with examples can be found in [12, 11]. This choice is motivated in this application because, in dense traffic scenarios, the VS-IMMC SB-MHT based only on kinematic information can be deficient during maneuvers and at crossroads. We first recall the PCR5 fusion rule and then, briefly, the principle of the (single-sensor based) Target Type Tracker.


4.1 PCR5 combination rule 


Let $C_{Tot} = \{\theta_1, \ldots, \theta_n\}$ be a discrete finite set of $n$ exhaustive elements, and consider two distinct bodies of evidence providing basic belief assignments (bba's) $m_1(.)$ and $m_2(.)$ defined on the power-set¹ of $C_{Tot}$. The idea behind the Proportional Conflict Redistribution (PCR) rules [11] is to transfer (total or partial) conflicting masses of belief to the non-empty sets involved in the conflicts, proportionally to the masses assigned to them by the sources. The way the conflicting mass is redistributed yields several versions of PCR rules, but PCR5 (i.e. PCR rule no. 5) does the most exact redistribution of conflicting mass to non-empty sets following the logic of the conjunctive rule and is well adapted to sequential fusion. It does a better redistribution of the conflicting mass than other rules since it goes backwards on the tracks of the conjunctive rule and redistributes the conflicting mass only to the sets involved in the conflict, proportionally to the masses they put in the conflict. The PCR5 formula for $s > 2$ sources is given in [11]. For the combination of only two sources (useful for sequential fusion in our application) when working with Shafer's model, it is given by $m_{PCR5}(\emptyset) = 0$ and, $\forall X \in 2^{C_{Tot}} \setminus \{\emptyset\}$:

$$m_{PCR5}(X) = m_{12}(X) + \sum_{\substack{Y \in 2^{C_{Tot}} \setminus \{X\} \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2 \, m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2 \, m_1(Y)}{m_2(X) + m_1(Y)} \right] \qquad (21)$$

where $m_{12}(X)$ corresponds to the conjunctive consensus on $X$ between the two sources (i.e. our prior bba on target ID available at time $k-1$ and our current observed bba on target ID at time $k$) and where all denominators are different from zero. If a denominator is zero, that fraction is discarded.
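A minimal Python sketch of the two-source PCR5 rule under Shafer's model is given below. It represents a bba as a dictionary mapping focal elements (as `frozenset` objects) to masses; this representation, the function name and the toy masses are illustrative assumptions. Each pair of focal elements with a non-empty intersection feeds the conjunctive consensus, and each conflicting pair returns its product of masses to the two sets involved, proportionally to the masses they put in the conflict.

```python
from itertools import product

def pcr5(m1, m2):
    """Combine two bbas (dict: frozenset -> mass) with the two-source PCR5 rule."""
    fused = {}
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        if mx * my == 0.0:
            continue                      # nothing to transfer; also avoids a zero denominator
        inter = x & y
        if inter:
            # conjunctive consensus on the intersection
            fused[inter] = fused.get(inter, 0.0) + mx * my
        else:
            # conflict between x (source 1) and y (source 2): proportional redistribution
            denom = mx + my
            fused[x] = fused.get(x, 0.0) + mx ** 2 * my / denom
            fused[y] = fused.get(y, 0.0) + my ** 2 * mx / denom
    return fused

a, b = frozenset({"a"}), frozenset({"b"})
ab = a | b
m_prior = {a: 0.6, ab: 0.4}
m_obs = {b: 0.7, ab: 0.3}
m_fused = pcr5(m_prior, m_obs)
# Combining with the vacuous bba leaves a bba unchanged.
m_same = pcr5(m_prior, {ab: 1.0})
```

In this toy case the conflicting mass 0.6 x 0.7 = 0.42 is split between the two singletons in the ratio 0.36 : 0.49 weighted as in (21), and the total mass remains equal to one.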


4.2 Principle of the target type tracker 


Our goal is to estimate the true target type $type(k)$ at time $k$ from the sequence of declarations $c(1), c(2), \ldots, c(k)$ made by the unreliable classifier² up to time $k$. To build an estimator $\widehat{type}(k)$ of $type(k)$, we use the general principle of the Target Type Tracker (TTT) developed in [12], which consists of the following steps:


• a) Initialization step (i.e. $k = 0$). Select the target type frame $C_{Tot} = \{\theta_1, \ldots, \theta_n\}$ and set the prior bba $m^-(.)$ as the vacuous belief assignment, i.e. $m^-(\theta_1 \cup \ldots \cup \theta_n) = 1$, since one has no information about the first observed target type.

¹ In our GMTI-MTT applications, we assume Shafer's model for the frame $C_{Tot}$ of target ID, which means that the elements of $C_{Tot}$ are assumed truly exclusive.

² Here we consider only one source of information/classifier, say based either on the EO/IR sensor or on a video sensor, for example. The multi-source case is discussed in section 4.3.


• b) Generation of the current bba $m_{obs}(.)$ from the current classifier declaration $c(k)$ based on the attribute measurement. At this step, one takes $m_{obs}(c(k)) = P\{c(k)\} = C_{c(k)c(k)}$, and all the unassigned mass $1 - m_{obs}(c(k))$ is then committed to the total ignorance $\theta_1 \cup \ldots \cup \theta_n$. $C_{c(k)c(k)}$ is the element of the known confusion matrix $C$ of the classifier indexed by $c(k)c(k)$.

• c) Combination of the current bba $m_{obs}(.)$ with the prior bba $m^-(.)$ to get the estimation of the current bba $m(.)$. Symbolically we write the generic fusion operator as $\oplus$, so that $m(.) = [m_{obs} \oplus m^-](.) = [m^- \oplus m_{obs}](.)$. The combination $\oplus$ is done according to the PCR5 rule (i.e. $m(.) = m_{PCR5}(.)$).

• d) Estimation of the true target type, obtained from $m(.)$ by taking the singleton of $C_{Tot}$, i.e. a target type, having the maximum of belief (or possibly the maximum pignistic probability):

$$\widehat{type}(k) = \arg\max_{A \in C_{Tot}} BetP\{A\} \qquad (22)$$

The pignistic probability is used to estimate the probability of obtaining the type $\theta_i \in C_{Tot}$ given the previous target type estimate $\widehat{type}(k-1)$:

$$BetP\{\theta_i\} = P\{type(k) = \theta_i \mid \widehat{type}(k-1)\} \qquad (23)$$

• e) Set $m^-(.) = m(.)$; do $k = k + 1$ and go back to step b).


Naturally, in order to revise the LLR in our GMTI-MTT systems to take into account the estimation of belief of target ID coming from the Target Type Trackers, we transform the resulting bba $m(.) = [m^- \oplus m_{obs}](.)$ available at each time $k$ into a probability measure. In this work, we use the classical pignistic transformation defined by [13]:

$$BetP\{A\} = \sum_{X \in 2^{C_{Tot}}} \frac{|X \cap A|}{|X|} \, m(X) \qquad (24)$$
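The generation of the observation bba from a classifier declaration and the pignistic decision can be sketched in Python as follows. The sketch represents a bba as a dictionary from `frozenset` focal elements to masses and evaluates the pignistic probability on singletons only; the function names, the frame labels and the probability value are illustrative assumptions.

```python
def classifier_bba(declared, prob, frame):
    """bba with mass `prob` on the declared type and the rest on total ignorance."""
    ignorance = frozenset(frame)
    bba = {frozenset({declared}): prob}
    if prob < 1.0:
        bba[ignorance] = bba.get(ignorance, 0.0) + (1.0 - prob)
    return bba

def pignistic(m, frame):
    """Pignistic probability of each singleton: every focal set shares its mass equally among its elements."""
    betp = {element: 0.0 for element in frame}
    for focal, mass in m.items():
        for element in focal:
            betp[element] += mass / len(focal)
    return betp

frame = ("tracked", "wheeled", "rotary")
m_obs = classifier_bba("wheeled", 0.7, frame)
betp = pignistic(m_obs, frame)
estimated_type = max(betp, key=betp.get)   # decision by maximum of BetP
```

Here the 0.3 of mass left on total ignorance is shared equally among the three types, so the declared type ends with a pignistic probability of 0.8 and the other two with 0.1 each.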


4.3 Working with multiple sensors 


Since in our application we work with different sensors (i.e. MTI and video EO/IR sensors), one has to deal with the discernment frames $C_{MTI}$ and $C_{video}$ defined in section 2. Therefore we need to adapt the (single-sensor based) TTT to the multi-sensor case. We first adapt the frame $C_{MTI}$ to $C_{video}$ and then extend the principle of TTT to combine multiple bba's (typically here $m^{MTI}_{obs}(.)$ and $m^{video}_{obs}(.)$) with the prior target ID bba $m^-(.)$ to finally get the updated global bba $m(.)$ at each time $k$. The proposed approach can theoretically be extended to any number of sensors. When no information is available from a given sensor, we take as related bba the vacuous mass of belief, which represents a totally ignorant source, because this does not change the result of the fusion rule [11] (which is a good property to satisfy). For mapping $C_{MTI}$ to $C_{video}$, we use a (human refinement) process such that each element of $C_{MTI}$ can be associated with at least one element of $C_{video}$. In this work, the delay on the type information provided by the video sensor is not taken into account to update the global bba $m(.)$. All type information (delayed or not, provided by MTI and video sensors) is considered as bba $m_{obs}(.)$ available for the current update. The explicit introduction of the delay of the out-of-sequence video information is under investigation.


4.4 Data attributes in the VS IMMC 


To improve the target tracking process, the target type probability is introduced in the likelihood calculation. For this, we consider the measurement $\mathbf{z}^{*}_{j}(k)$ ($\forall j \in \{1, \ldots, m_k\}$) described in (9) and (13). With the assumption that the kinematic and classification observations are independent, it is easy to prove that the new combined likelihood $\Lambda_N^l$ associated with a track $T^{k,l}$ is the product of the kinematic likelihood (17) with the classification probability, such that:

$$\Lambda_N^l(k) = \Lambda^l(k) \cdot P\{\widehat{type}(k) \mid \widehat{type}(k-1)\} \qquad (25)$$

where the probability $P\{\widehat{type}(k) \mid \widehat{type}(k-1)\}$ is chosen as the pignistic probability value on the declared target type $\widehat{type}(k)$ given $\widehat{type}(k-1)$, derived from the updated mass of belief $m(.)$ according to our target type tracker.


5 Simulations and results 


5.1 Scenario description 


To evaluate the performance of the VS-IMMC SB-MHT with the attribute type information, we consider 10 maneuvering (acceleration, deceleration, stop) targets on a real road network. The 10 target types are given by (12). Target 1 passes the military vehicles 2, 3, 4 and 7. Targets 2, 3, 4 and 7 start from the same starting point. Target 2 passes vehicles 3 and 7 so as to place itself at the front of the convoy. Targets 5, 6, 9 and 10 are civilian vehicles and cross targets 1, 2, 3 and 7 at several junctions. The goal of this simulation is to reduce the association complexity by taking into account the road network topology and the attribute types given by heterogeneous sensors. In this scenario, we consider one GMTI sensor located at (−50 km, −60 km) at 4000 m in elevation, one UAV located at (−100 m, −100 m) at 1200 m in elevation, and 5 UGS distributed on the ground. The GMTI sensor tracks the 10 targets every 10 seconds with 20 m, 0.0008 rad and 1 m·s⁻¹ range, cross-range and range-rate measurement standard deviations respectively. The detection probability $P_D$ is equal to 0.9 and the MDV (Minimal Detectable Velocity) is fixed at 1 m·s⁻¹. The false alarm density is fixed ($\lambda_{fa} = 10^{-8}$). The confusion matrix described in part 4.2 is given by:

$$C_{MTI} = diag([0.8 \;\; 0.7 \;\; 0.9]) \qquad (26)$$

This confusion matrix is only used to simulate the target type probability of the GMTI sensor. The data obtained by the UAV are given at 10-second intervals with a 10 m standard deviation in the X and Y directions of the TCF. The time delay of the video data is constant and equal to 11 seconds. The detection probability $P_D$ is equal to 0.9. The human operator only selects, for each video report, a type defined by (12). In our simulations, the target type probability depends on the sensor resolution. For this, we consider the volume $V_{video}$ of the sensor surveillance area on the ground. The diagonal terms of the confusion matrix $C_{video}$ are equal to $P\{c(k)\}$, where $P\{c(k)\}$ is defined by:


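$$P\{c(k)\} = \begin{cases} 0.90 & \text{if } V_{video} < 10^{6}\,\text{m}^2 \\ 0.75 & \text{if } 10^{6}\,\text{m}^2 < V_{video} < 10^{8}\,\text{m}^2 \\ 0.50 & \text{if } V_{video} > 10^{8}\,\text{m}^2 \end{cases} \qquad (27)$$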


For the UGS, target detection occurs only if the target is located within the minimal range detection (MRD). The MRD is fixed at 1000 m for the 5 UGS, and each sensor gives delayed measurements every second. The time delay is also equal to 11 seconds. A specificity of the UGS is that it gives only one target detection during 4 seconds before it can detect another target. We recall that there are no false alarms for this sensor. Based on [4], the target type probability depends on $\alpha$ (i.e. the target orientation towards the UGS). The more orthogonal the target orientation is to the sensor line of sight, the higher the target type probability. The diagonal terms of the confusion matrix $C_{UGS}$ are equal to $P\{c(k)\}$, where $P\{c(k)\}$ is defined by:


0.90 if#%<a<t 


0.50 28) 


P{e(k)} = 


otherwise 


For each detected target, a uniform random number 
u ~ U([0,1]) is drawn. If u is greater than the true 
target type probability of the confusion matrix, a wrong 
target type is declared for the ID report and used with 
its associated target type probability. The targets are 
scanned at different times by the sensors (figure 1). 
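The declaration draw described above can be sketched in Python as follows. The sketch assumes that a wrong type is picked uniformly among the other types of the frame (the text does not say how the wrong type is chosen) and that the reported probability is the diagonal term of the declared type; the function name and the dictionary layout are hypothetical.

```python
import random

def simulate_declaration(true_type, p_diag, rng):
    """Draw a classifier declaration for a detected target.

    p_diag maps each type to the diagonal term of the confusion matrix.
    Returns the declared type and its associated probability.
    """
    u = rng.random()                       # uniform draw in [0, 1)
    declared = true_type
    if u > p_diag[true_type]:
        # Assumption: the wrong type is chosen uniformly among the other types.
        wrong_types = [t for t in p_diag if t != true_type]
        declared = rng.choice(wrong_types)
    return declared, p_diag[declared]

rng = random.Random(0)
p_mti = {"c1": 0.8, "c2": 0.7, "c3": 0.9}   # diagonal of C_MTI in (26)
declarations = [simulate_declaration("c2", p_mti, rng) for _ in range(1000)]
correct_rate = sum(d == "c2" for d, _ in declarations) / len(declarations)
```

Over many draws the fraction of correct declarations approaches the diagonal term of the true type, here 0.7.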


5.2 Filter parameters 


We consider three motion models (i.e. $i \in \{0, 1, 2\}$), which are respectively a stop model $M_0$ when the target is assumed to have zero velocity, a constant velocity model $M_1$ with a low uncertainty, and a constant velocity model $M_2$ with a high uncertainty (modeled by a strong noise). The parameters of the IMM are the following: for the motion model $M_1$, the standard deviations along and orthogonal to the road segment are equal to 0.05 m·s⁻²; the constrained constant velocity model $M_2$ has a high standard deviation to adapt the dynamics to the target maneuver (the standard deviations along and orthogonal to the road segment are respectively equal to 0.8 m·s⁻² and 0.4 m·s⁻²); and the stop motion model $M_0$ has a standard deviation equal to zero. These constrained motion models are, however, adapted to follow the road network topology. The transition matrix and the SB-MHT parameters are those taken in [5].

Figure 1: Target's sensor illumination.


5.3 Results 


For each confirmed track given by the VS-IMMC SB-MHT, a test is used to associate the track with the most probable target. The goal of target tracking is to follow each target with a single track for as long as possible. To evaluate the track maintenance, we use the track length ratio criterion, the averaged root mean square error (denoted ARMSE) for each target, the track purity and the type purity (only for the tracks obtained with PCR5) [5]. These measures of performance are averaged over 50 Monte-Carlo runs.

On figure 2, one sees that the track length ratio becomes better with PCR5 than without, as expected for target 6. When targets 1 and 2 pass targets 3, 4 and 7, an association ambiguity arises when associating the tracks with the correct measurements. This is due to the close formation of the targets relative to the GMTI sensor resolution and to the road network configuration with junctions. Sometimes tracks are lost with the VS IMMC SB-MHT without PCR5, and new tracks are then built for each target. That is why the track purity of the VS IMMC SB-MHT without PCR5 (Table 1) is smaller than the track purity with PCR5 (Table 2). The track precision, given by the ARMSE criterion, is also better with PCR5. Target 6 is only scanned by the GMTI sensor, and its associated performances are equivalent for both algorithms. Hence, if there is no IMINT information and no interaction between targets, the performances of the algorithm with PCR5 are the same as without PCR5.

Figure 2: Track length ratio.

Table 2: Tracking results (VSIMMC and PCR5).

Despite the PCR5 improvement on the target tracking, the difference in performance between the algorithms is not significant. If there is an interaction between IMINT and GMTI information, we can see a gain of 10% on the track length ratio or track purity with PCR5. This small difference is due to the good constrained state estimation. The estimated target states have a good precision because the target tracking takes into account the road segment locations and benefits from the good performance of the OOSM approach, which implies a substantial improvement of the target-to-track association. In addition, in Table 2, the type purity based on PCR5 is derived from the maximum of BetP criterion. But BetP is computed according to the set $C_{video}$ (12), and if the track receives only MTI reports, the choice of the target type is arbitrary for the tracked vehicles of $C_{MTI}$ (8). In fact, a tracked vehicle can be any of 6 elements of (12), so the probability BetP is the same on these 6 elements. The selection of the maximum of BetP has no meaning in such a case, because the maximum becomes arbitrary. This explains the bad track purity of targets 6 and 9.


6 Conclusion 


In this paper, we have presented a new approach to improve the VS IMMC SB-MHT by introducing data fusion with several heterogeneous sensors. Starting from a centralized architecture, the MTI and IMINT reports are fused by taking into account the road network information and the OOSM algorithm for delayed measurements. The VS IMMC SB-MHT is extended by introducing in the data association process the type information defined in the STANAG 4607 and an IMINT attribute set. The estimation of the target ID probability is done from the updated/current attribute mass of belief using the Proportional Conflict Redistribution rule no. 5 developed in the DSmT framework and according to the Target Type Tracker (TTT) recently developed by the authors. The target ID probability, once obtained, is then introduced in the track score computation in order to improve the likelihoods of each data association hypothesis of the SB-MHT. Our preliminary results show an improvement of the performance of the VS-IMMC SB-MHT when the type information is processed by our PCR5-based Target Type Tracker. In this work, we did not distinguish undelayed from delayed sensor reports in the TTT update. This problem is under investigation and offers new perspectives for dealing efficiently with the time delay of the type information and for improving performance. One simple solution would be to apply a forgetting factor to the delayed type information, but other solutions also seem possible and need to be evaluated. Some work also remains to be done to use the operational ontology APP-6A for the heterogeneous type information. Actually, the frame of the IMINT type information is bigger than the one used in this paper, and the IMINT type information can be given at different granularity levels. As a third perspective, we plan to use both the type and contextual information in order to recognize the tracks lost in the terrain masks, which represent the possible target occultations due to the terrain topography in real environments.
Threat assessment of a possible Vehicle-Born
Improvised Explosive Device using DSmT


Jean Dezert 
Florentin Smarandache 


Abstract: This paper presents a solution to the VBIED (Vehicle-Born Improvised Explosive Device) threat assessment problem obtained with DSmT (Dezert-Smarandache Theory). This problem was recently proposed to the authors by Simon Maskell and John Lavery as a typical illustrative example for comparing the different approaches for dealing with uncertainty in decision-making support. The purpose of this paper is to show in detail how a well-justified solution can be obtained from the DSmT approach and its fusion rules, thanks to a proper modeling of the belief functions involved in this problem.


Keywords: Security, Decision-making support, Information fusion, DSmT, Threat assessment.


1 The VBIED problem 


• Concern: VBIED (Vehicle-Born Improvised Explosive Device) attack on an administrative building B.


• Prior information: We consider an individual A under surveillance due to previous unstable behavior who drives a customized white Toyota (WT) vehicle.


• Observation done at time t − 10 min: From a video sensor on the road that leads to building B, one has observed 10 min ago a white Toyota 200 m from building B traveling in normal traffic flow toward building B. We consider the following two sources of information based on this video observation available at time t − 10 min:


- Source 1: An Analyst 1 with 10 years of experience analyses the video and concludes that individual A is now probably near building B.

- Source 2: An Automatic Number Plate Recognition (ANPR) system analyzing the same video outputs a 30% probability that the vehicle is individual A's white Toyota.
Originally published as Dezert J., Smarandache F., Threat assessment of
a possible Vehicle-Born Improvised Explosive Device using DSmT,
(presented during the forum on uncertainty at Fusion 2010
conference), Edinburgh, Scotland, UK, 26-29 July 2010, and printed
with permission.


• Observation done at time t − 5 min: From a video sensor on a road 15 km from building B, one gets a video taken 5 min ago that indicates a white Toyota with some resemblance to individual A's white Toyota. We consider the following third source of information based on this video observation available at time t − 5 min:


- Source 3: An Analyst 2 (new in post) analyses this video and concludes that it is improbable that individual A is near building B.


• Question 1: Should building B be evacuated?


• Question 2: Is experience (Analyst 1) more valuable than physics (the ANPR system) combined with inexperience (Analyst 2)? How do we model that?


NOTE: Deception (e.g., individual A using a different car, false number plates, etc.) and biasing (on the part of the analysts) are often a part of reality, but they are not part of this example.


2 Modeling the VBIED problem 


Before applying DSmT fusion techniques to solve this 
VBIED problem it is important to model the problem 
in the framework of belief functions. 


2.1 Marginal frames with their models 
The marginal frames involved in this problem are:

• Frame related with individuals:

$$\Theta_1 = \{A = \text{Suspicious person},\; \bar{A} = \text{not } A\}$$

• Frame related with the vehicle:

$$\Theta_2 = \{V = \text{White Toyota Vehicle},\; \bar{V} = \text{not } V\}$$

• Frame related with the position of a driver of a car w.r.t. the given building B:

$$\Theta_3 = \{B = \text{near building},\; \bar{B} = \text{not } B\}$$


The underlying models of the marginal frames are based on the following very reasonable assumptions:

• Assumption 1: We assume naturally $A \cap \bar{A} = \emptyset$ (avoiding Schrödinger's cat paradox). If working only with the frame of people $\Theta_1$, the marginal bba's must be defined on the power-set

$$2^{\Theta_1} = \{\emptyset_1, A, \bar{A}, A \cup \bar{A}\}$$

• Assumption 2: We assume also that $V \cap \bar{V} = \emptyset$, so that the marginal bba (if needed) must be defined on the power-set

$$2^{\Theta_2} = \{\emptyset_2, V, \bar{V}, V \cup \bar{V}\}$$

• Assumption 3: We assume also that $B \cap \bar{B} = \emptyset$, so that the marginal bba (if needed) must be defined on the power-set

$$2^{\Theta_3} = \{\emptyset_3, B, \bar{B}, B \cup \bar{B}\}$$

This modeling is disputable since the notion of closeness ("near") is not clearly defined, and we could prefer to work on

$$D^{\Theta_3} = \{\emptyset_3, B \cap \bar{B}, B, \bar{B}, B \cup \bar{B}\}$$

The empty set elements have been indexed by the index of the frame they refer to, for notational convenience and to avoid confusion.


2.2 Joint frame and its model 


Since we need to work with all aspects of the available
information, we need to define a common joint frame
to express all that we have from the different sources of
information. The easiest way of defining the joint
frame, denoted $\Theta$, is to consider the classical Cartesian
(cross) product space and to work with propositions
(a Lindenbaum-Tarski algebra of propositions),
since one has a correspondence between sets and
propositions [5, 6], i.e.

$$\Theta = \Theta_1 \times \Theta_2 \times \Theta_3$$

which consists of the following 8 triplet elements

$$\Theta = \{\theta_1 = (\bar{A}, \bar{V}, \bar{B}),\ \theta_2 = (A, \bar{V}, \bar{B}),\ \theta_3 = (\bar{A}, V, \bar{B}),\ \theta_4 = (A, V, \bar{B}),$$
$$\theta_5 = (\bar{A}, \bar{V}, B),\ \theta_6 = (A, \bar{V}, B),\ \theta_7 = (\bar{A}, V, B),\ \theta_8 = (A, V, B)\}$$

We define the union $\cup$ and the intersection $\cap$ as
componentwise operators in the following way:

$$(x_1, x_2, x_3) \cup (y_1, y_2, y_3) = (x_1 \cup y_1,\ x_2 \cup y_2,\ x_3 \cup y_3)$$

$$(x_1, x_2, x_3) \cap (y_1, y_2, y_3) = (x_1 \cap y_1,\ x_2 \cap y_2,\ x_3 \cap y_3)$$

The complement $\bar{X}$ of $X$ is defined in the usual way by

$$\bar{X} = \overline{(x_1, x_2, x_3)} \triangleq I_t \setminus \{X\}$$

where $I_t$ is the total ignorance (i.e. the whole space of
solutions), which corresponds to the maximal element
defined by $I_t = (I_{t1}, I_{t2}, I_{t3})$, where $I_{ti}$ is the maximal
(ignorance) element of $\Theta_i$, $i = 1, 2, 3$. The minimum element
(absolute empty proposition) is $\emptyset = (\emptyset_1, \emptyset_2, \emptyset_3)$, where
$\emptyset_i$ is the minimum element (empty proposition) of
$\Theta_i$. We also define a relative minimum element in
$S^{\Theta_1 \times \Theta_2 \times \Theta_3}$ as follows: $\emptyset_r = (x, y, z)$, where at least
one of the components x, y, or z is a minimal element
in its respective frame $\Theta_i$. A general relative minimum
element $\emptyset_{gr}$ is defined as the union/join of all relative
minima (including the absolute minimum element).
Similarly to the relative and general relative minimum,
we can define a relative maximum and a general
relative maximum, where the empty set in the above
definitions is replaced by the total ignorance. Whence
the super-power set $(S^{\Theta}, \cap, \cup, \bar{\ }, \emptyset, I_t)$ is equivalent to
the Lindenbaum-Tarski algebra of propositions.
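The joint frame and its componentwise operators can be illustrated with a short Python sketch. The string labels (a leading "~" for a complement), the names `THETA`, `as_sets`, `union` and `intersection`, and the choice to store each triplet component as a set are assumptions made only for this illustration; the numbering follows the one used for $\theta_4$, $\theta_6$, $\theta_7$ and $\theta_8$ in the text.

```python
from itertools import product

# Marginal frames; "~X" denotes "not X" (labelling used only in this sketch).
FRAME_1 = ("~A", "A")
FRAME_2 = ("~V", "V")
FRAME_3 = ("~B", "B")

# The first component varies fastest, so that theta_8 = (A, V, B),
# theta_7 = (~A, V, B), theta_6 = (A, ~V, B) and theta_4 = (A, V, ~B).
THETA = {
    k: (a, v, b)
    for k, (b, v, a) in enumerate(product(FRAME_3, FRAME_2, FRAME_1), start=1)
}


def as_sets(triplet):
    """Turn a triplet of labels into a triplet of singleton sets."""
    return tuple(frozenset([component]) for component in triplet)


def union(x, y):
    """Componentwise union of two triplets of sets."""
    return tuple(xi | yi for xi, yi in zip(x, y))


def intersection(x, y):
    """Componentwise intersection of two triplets of sets."""
    return tuple(xi & yi for xi, yi in zip(x, y))


# theta_4 U theta_8 = (A, V, B U ~B): the position is left unspecified.
theta4_or_theta8 = union(as_sets(THETA[4]), as_sets(THETA[8]))
```

In this representation an empty component in an intersection, for instance in `intersection(as_sets(THETA[4]), as_sets(THETA[8]))`, plays the role of a relative minimum element.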


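For example, if we consider $\Theta_1 = \{x_1, x_2\}$ and $\Theta_2 = \{y_1, y_2\}$,
both satisfying Shafer's model, then
$\Theta = \Theta_1 \times \Theta_2 = \{(x_1, y_1), (x_1, y_2), (x_2, y_1), (x_2, y_2)\}$, and one has:

$$\emptyset = (\emptyset_1, \emptyset_2)$$
$$\emptyset_{r1} = (\emptyset_1, y_1)$$
$$\emptyset_{r2} = (\emptyset_1, y_2)$$
$$\emptyset_{r3} = (\emptyset_1, y_1 \cup y_2)$$
$$\emptyset_{r4} = (x_1, \emptyset_2)$$
$$\emptyset_{r5} = (x_2, \emptyset_2)$$
$$\emptyset_{r6} = (x_1 \cup x_2, \emptyset_2)$$

and thus

$$\emptyset_{gr} = \emptyset \cup \emptyset_{r1} \cup \emptyset_{r2} \cup \ldots \cup \emptyset_{r6}$$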


Based on the definition of the joint frame $\Theta$ with operations
on its elements, we need to choose its underlying model
(Shafer's, free or hybrid model) to define the fusion space
on which the bba's will be defined. According to the
definition of absolute and relative minimal elements,
we then assume for the given VBIED problem that $\Theta$
satisfies Shafer's model, i.e. all (triplet) elements $\theta_i \in \Theta$
are exclusive, so that the bba's of the sources will be
defined on the classical power-set $2^{\Theta}$.

2.3 Supporting hypotheses for decision


In the VBIED problem the main question (Q1) is related
with the security of people in the building B. The
potential danger related with this building is of course
$\theta_8 = (A, V, B)$, i.e. the presence of A in his/her car V
near the building B. This is unfortunately not the only
origin of the danger, since the threat can also come from
the possible presence of V (possibly A's improvised
explosive vehicle) parked near the building B even if A
has left his/her car and is not himself/herself near the
building. This second origin of danger is represented by
$\theta_7 = (\bar{A}, V, B)$. There exists also a third origin of the
danger, represented by $\theta_6 = (A, \bar{V}, B)$, which reflects the
possibility to have A near the building without the car V.
$\theta_6$ is also dangerous for the building B since A can try
to commit a suicide attack as a human bomb against the
building. Therefore, based on these three sources of
potential danger, the most reasonable/prudent supporting
hypothesis for decision-making is to consider

$$\theta_6 \cup \theta_7 \cup \theta_8 = (A, \bar{V}, B) \cup (\bar{A}, V, B) \cup (A, V, B)$$

If we assume that the danger is mostly due to the presence
of A's vehicle, possibly containing a high charge of
explosive, near the building B rather than to the human
bomb attack, then one can prefer to consider only the
following hypothesis for decision-making support evaluation

$$\theta_7 \cup \theta_8 = (\bar{A}, V, B) \cup (A, V, B)$$

Finally, if we are more optimistic, we can consider
that the real danger occurs if and only if A drives V
near the building B, and therefore one could consider
only the supporting hypothesis $\theta_8 = (A, V, B)$ for the
danger in the decision-making support evaluation.


In the sequel, we adopt the worst scenario (we
take the most prudent choice) and we consider all
three origins of potential danger. Thus we will take
$\theta_6 \cup \theta_7 \cup \theta_8$ as the cautious/prudent supporting hypothesis
for decision-making.


The propositions $\theta_6 \cup \theta_8 = (A, \bar{V}, B) \cup (A, V, B)$ and
$\theta_6 \cup \theta_7 = (A, \bar{V}, B) \cup (\bar{A}, V, B)$ also represent a potential
danger and could serve as decision-support hypotheses
as well; their imprecise probabilities can be evaluated
easily following the analysis presented in the sequel. They
have not been reported in this paper to keep it at a
reasonable size.


2.4 Choice of bba’s of sources 


Let's first define the bba of each source without regard
to what could be their reliability and importance
in the fusion process. Reliability and importance will
be examined in detail in the next sections.

• Bba related with source 0 (prior information):
The prior information states that the suspect A
drives a white Toyota, and nothing is stated in
the prior information with respect to his location,
so that we must consider the bba representing the
prior information as

$$m_0(\theta_4 \cup \theta_8) = m_0((A, V, \bar{B}) \cup (A, V, B)) = m_0((A, V, B \cup \bar{B})) = 1$$

• Bba related with source 1 (Analyst 1 with 10
years experience): Source 1 reports that the
suspect A is probably now near the building B.
This source however doesn't report explicitly whether
the suspect A is still with his white Toyota car or
not. So the fair way to model this report when
working on $\Theta$ is to commit a high mass of belief to
the element $\theta_6 \cup \theta_8$, that is

$$m_1(\theta_6 \cup \theta_8) = m_1((A, \bar{V}, B) \cup (A, V, B)) = m_1((A, V \cup \bar{V}, B)) = 0.75$$

and to commit the uncommitted mass to $I_t$ based
on the principle of minimum of specificity, so that

$$m_1(\theta_6 \cup \theta_8) = 0.75 \quad \text{and} \quad m_1(I_t) = 0.25$$

• Bba related with source 2 (ANPR system):
Source 2 reports a 30% probability that the vehicle
is individual A's white Toyota. Nothing is reported
on the position information. The information
provided by this source corresponds actually
to incomplete probabilistic information. Indeed,
when working on $\Theta_1 \times \Theta_2$, what we only know
is that $P\{(A, V)\} = 0.3$ and
$P\{(\bar{A}, V) \cup (A, \bar{V}) \cup (\bar{A}, \bar{V})\} = 0.7$ (from the additivity
axiom of probability theory), and thus the bba $m_2(.)$ we must
choose on $\Theta_1 \times \Theta_2 \times \Theta_3$ has to be compatible with this
incomplete probabilistic information, i.e. the projection
$m_2'(.) \triangleq m_2^{\downarrow \Theta_1 \times \Theta_2}(.)$ of $m_2(.)$ on $\Theta_1 \times \Theta_2$
must satisfy the following constraints on belief and
plausibility functions

$$Bel'((A, V)) = 0.3$$
$$Bel'((\bar{A}, V) \cup (A, \bar{V}) \cup (\bar{A}, \bar{V})) = 0.7$$

and also

$$Pl'((A, V)) = 0.3$$
$$Pl'((\bar{A}, V) \cup (A, \bar{V}) \cup (\bar{A}, \bar{V})) = 0.7$$

because belief and plausibility correspond to lower
and upper bounds of the probability measure [5]. So
it is easy to verify that the following bba $m_2'(.)$
satisfies these constraints because the elements of
the frame $\Theta_1 \times \Theta_2$ are exclusive:

$$m_2'((A, V)) = 0.3$$
$$m_2'((\bar{A}, V) \cup (A, \bar{V}) \cup (\bar{A}, \bar{V})) = 0.7$$

We can then extend $m_2'(.)$ to $\Theta_1 \times \Theta_2 \times \Theta_3$ using
the minimum specificity principle (i.e. take the
vacuous extension of $m_2'(.)$) to get the bba $m_2(.)$
that we need to solve the VBIED problem. That
is $m_2(.) = m_2'^{\uparrow \Theta_1 \times \Theta_2 \times \Theta_3}(.)$ with

$$m_2((A, V, B \cup \bar{B})) = 0.3$$
$$m_2((\bar{A}, V, B \cup \bar{B}) \cup (A, \bar{V}, B \cup \bar{B}) \cup (\bar{A}, \bar{V}, B \cup \bar{B})) = 0.7$$

or equivalently

$$m_2(\theta_4 \cup \theta_8) = 0.3$$
$$m_2(\overline{\theta_4 \cup \theta_8}) = m_2(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7) = 0.7$$


• Bba related with source 3 (Analyst 2 with no
experience): Source 3 reports that it is improbable
that the suspect A is near the building B.
This source however doesn't report explicitly
whether the suspect A is still with his white Toyota
car or not. So the fair way to model this report
when working on $\Theta$ is to commit a low mass of
belief to the element $\theta_6 \cup \theta_8$, that is

$$m_3(\theta_6 \cup \theta_8) = m_3((A, \bar{V}, B) \cup (A, V, B)) = m_3((A, V \cup \bar{V}, B)) = 0.25$$

and to commit the uncommitted mass to $I_t$ based
on the principle of minimum of specificity, so that

$$m_3(\theta_6 \cup \theta_8) = 0.25 \quad \text{and} \quad m_3(I_t) = 0.75$$
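The vacuous extension used for source 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the label strings ("~" for a complement), the helper name `vacuous_extension` and the dictionary encoding of a bba (focal set mapped to mass) are hypothetical choices, and the indices 1 to 8 follow the numbering of the triplets $\theta_1, \ldots, \theta_8$.

```python
from itertools import product

# Index of each triplet (a, v, b) in the joint frame, theta_1 .. theta_8
# (first component varying fastest, so that theta_8 = (A, V, B)).
INDEX = {
    (a, v, b): k
    for k, (b, v, a) in enumerate(
        product(("~B", "B"), ("~V", "V"), ("~A", "A")), start=1
    )
}


def vacuous_extension(m_pairs):
    """Extend a bba defined on Theta_1 x Theta_2 to the joint frame.

    Each focal set of (a, v) pairs is crossed with the whole frame
    {B, ~B}, so no information about the position is introduced.
    """
    extended = {}
    for focal, mass in m_pairs.items():
        cylinder = frozenset(
            INDEX[(a, v, b)] for (a, v) in focal for b in ("B", "~B")
        )
        extended[cylinder] = extended.get(cylinder, 0.0) + mass
    return extended


# Bba of the ANPR system on Theta_1 x Theta_2.
m2_prime = {
    frozenset({("A", "V")}): 0.3,
    frozenset({("~A", "V"), ("A", "~V"), ("~A", "~V")}): 0.7,
}

m2 = vacuous_extension(m2_prime)
# m2 assigns 0.3 to {theta_4, theta_8} and 0.7 to its complement.
```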


2.5 Reliability of sources 


Let's identify what is known about the reliability of
sources and information:


• Reliability of prior information: it is (implicitly)
supposed that the prior information is 100%
reliable, that is "Suspect A drives a white Toyota",
which corresponds to the element $(A, V, \bar{B}) \cup (A, V, B)$.
So we can take the reliability factor of the
prior information as $\alpha_0 = 1$. If one considers the
prior information highly reliable (but not totally
reliable) then one could take $\alpha_0 = 0.9$, so that $m_0(.)$
would be

$$m_0(\theta_4 \cup \theta_8) = 0.9 \quad \text{and} \quad m_0(I_t) = 0.1$$

• Reliability of source 1: One knows that Analyst 1
has 10 years experience, so we must consider
him/her as having a good reliability (say greater than
75%); to be less precise, we can just assign to him/her
a qualitative reliability factor with a minimal number
of labels in $\{L_1 = \text{not good}, L_2 = \text{good}\}$. Here
we should choose $\alpha_1 = L_2$. As a first approximation,
we can consider $\alpha_1 = 1$.


• Reliability of source 2: No information about
the reliability of the ANPR system is explicitly given.
We may consider that if such a device is used it is
because it is also considered as a valuable tool, and
thus we assume it has a good reliability too, that
is $\alpha_2 = 1$. If we want to be more prudent we
should consider the reliability factor of this source
as totally unknown, and thus we should take it as
very imprecise with $\alpha_2 = [0, 1]$ (or qualitatively as
$\alpha_2 = [L_0, L_3]$). If we are more optimistic and consider
the ANPR system as reliable enough, we could
take $\alpha_2$ a bit more precise with $\alpha_2 = (0.75, 1]$ (i.e.
$\alpha_2 > 0.75$) or just qualitatively as $\alpha_2 = L_2$.


• Reliability of source 3: It is said explicitly that
Analyst 2 is new in post, which means that Analyst 2
has no great experience, and it can be inferred
logically that he/she is less reliable than Analyst 1, so
that we must choose $\alpha_3 < \alpha_1$. But we can also
have a very young brilliant analyst who performs
very well too with respect to the older Analyst 1.
So to be more cautious/prudent, we should also
consider the case of an unknown reliability factor $\alpha_3$
by taking qualitatively $\alpha_3 = [L_0, L_3]$, or quantitatively
by taking $\alpha_3$ as a very imprecise value, that
is $\alpha_3 = [0, 1]$.
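The effect of a precise reliability factor such as $\alpha_0 = 0.9$ can be sketched with the classical Shafer discounting, which scales every mass by $\alpha$ and transfers the removed mass $1 - \alpha$ to the total ignorance $I_t$. The function name `discount` and the encoding of focal elements as frozensets of indices 1 to 8 are assumptions of this sketch; imprecise or qualitative factors such as $[0, 1]$ or $[L_0, L_3]$ are not covered.

```python
def discount(m, alpha, ignorance):
    """Shafer's reliability discounting of a bba.

    m         : dict mapping focal elements (frozensets) to masses
    alpha     : reliability factor in [0, 1]
    ignorance : focal element representing the total ignorance I_t
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    discounted = {x: alpha * v for x, v in m.items() if x != ignorance}
    # The mass removed from the focal elements goes to the total ignorance.
    discounted[ignorance] = alpha * m.get(ignorance, 0.0) + (1.0 - alpha)
    return discounted


I_T = frozenset(range(1, 9))            # theta_1 U ... U theta_8
m0 = {frozenset({4, 8}): 1.0}           # prior information, fully reliable
m0_discounted = discount(m0, 0.9, I_T)  # 0.9 on theta_4 U theta_8, 0.1 on I_t
```

With `alpha = 1` the bba is unchanged, and with `alpha = 0` it becomes the vacuous bba that commits all mass to $I_t$.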


2.6 Importance of sources 


Not much is explicitly said about the importance
of the sources of information in the VBIED problem
statement, except that Analyst 1 has ten years of
experience and Analyst 2 is new in post, so it seems
logical to choose importance factors such that $\beta_1 > \beta_3$. The
importance discounting factors have been introduced
and presented by the authors in [2, 7]. As a prudent
attitude we could also choose $\beta_0 = [0, 1] = [L_0, L_3]$
and $\beta_2 = [0, 1] = [L_0, L_3]$ (very imprecise values). If
we consider that the prior information and source 2
(ANPR) have the same importance, we could just take
$\beta_0 = \beta_2 = 1$ to make derivations easier and adopt a
more optimistic point of view.


3 Solution of VBIED problem 


We apply the PCR5 and PCR6 fusion rules, developed
originally in the DSmT framework, to get the solution
of the VBIED problem. PCR5 has been developed
by the authors in [6], Vol. 2, and PCR6 is a variant
of PCR5 proposed by Arnaud Martin and Christophe
Osswald in [3]. Several codes for using PCR5 and
PCR6 have been proposed in the literature, for example
in [3, 1, 7], and are available from the authors upon request.


1 Of course the importance discounting factors can also be
chosen approximately from exogenous information upon the
desiderata of the fusion system designer. This question is out
of the scope of this paper.


Two cases are explored, depending on whether or not the
reliability and the importance of the sources are taken
into account in the fusion process. To simplify the
presentation of the results we denote the focal elements
involved in this VBIED problem as:

$$f_1 \triangleq \theta_4 \cup \theta_8$$
$$f_2 \triangleq \theta_6 \cup \theta_8$$
$$f_3 \triangleq \theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7 = \overline{\theta_4 \cup \theta_8}$$
$$f_4 \triangleq I_t = \theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4 \cup \theta_5 \cup \theta_6 \cup \theta_7 \cup \theta_8$$

Only these focal elements are involved in the inputs of
the problem, and we recall the two questions that we
must answer:


Question 1 (Q1): Should building B be evacuated?
Question 1 must be answered by analyzing
the level of belief and plausibility committed to the
propositions supporting B through the fusion process.


Question 2 (Q2): Is experience (Analyst 1) more
valuable than physics (the ANPR system) combined
with inexperience (Analyst 2)? How do we model that?
Question 2 must be answered by analyzing and
comparing the results of the fusion $m_1 \oplus m_3$ (or
eventually $m_0 \oplus m_1 \oplus m_3$) with respect to $m_2$ only (resp.
$m_0 \oplus m_2$).


3.1 Without reliability and importance 


We provide here the solutions of the VBIED problem
with direct PCR5 and PCR6 fusion of the sources
for different quantitative inputs summarized in the
tables below. We also present the result of the DSmP
probabilistic transformation [6] (Vol. 3, Chap. 3) of the
resulting bba's to get an approximate probability
measure of the elements of $\Theta$. No importance or reliability
discounting has been applied since, in this section,
we consider that all sources have the same importance
and the same reliability.


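Example 1: We take the bba's described in section
2.4, that is

focal element | $m_0(.)$ | $m_1(.)$ | $m_2(.)$ | $m_3(.)$
$\theta_4 \cup \theta_8$ | 1 | 0 | 0.3 | 0
$\theta_6 \cup \theta_8$ | 0 | 0.75 | 0 | 0.25
$\overline{\theta_4 \cup \theta_8}$ | 0 | 0 | 0.7 | 0
$I_t$ | 0 | 0.25 | 0 | 0.75

Table 1: Quantitative inputs of VBIED problem.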





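[Table 2: Result of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 1, with columns $m_{PCR5}(.)$ and $m_{PCR6}(.)$; entries missing.]

[Table 3: $DSmP_\epsilon$ of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 1, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$; entries missing.]

Table 4: BetP of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 1. [Columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$; entries missing.]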


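From the fusion result of Table 2, one gets for the danger
supporting hypothesis $\theta_6 \cup \theta_7 \cup \theta_8$ (the worst scenario
case)

• with PCR5: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.64596$
  $P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.35404, 1]$
  $P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.64596]$

• with PCR6: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.61038$
  $P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.38962, 1]$
  $P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.61038]$

where $\Delta(X) = Pl(X) - Bel(X)$ is the imprecision
related to $P(X)$. It is worth noting
that $\Delta(X) = Pl(X) - Bel(X) = \Delta(\bar{X})$ because
$Pl(\bar{X}) = 1 - Bel(X)$ and $Bel(\bar{X}) = 1 - Pl(X)$.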
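The PCR6 bounds quoted above can be reproduced with a short Python sketch. It assumes the PCR6 rule as it is usually stated: every product of masses whose focal elements have an empty intersection is redistributed to the focal elements involved in that product, proportionally to the masses they contributed. The function names `pcr6`, `bel` and `pl`, and the encoding of focal elements as frozensets of the indices 1 to 8, are hypothetical choices made for this illustration; it is not the authors' code, and PCR5 for more than two sources is not covered.

```python
from itertools import product

I_T = frozenset(range(1, 9))    # total ignorance, theta_1 U ... U theta_8
F1 = frozenset({4, 8})          # theta_4 U theta_8
F2 = frozenset({6, 8})          # theta_6 U theta_8
F3 = I_T - F1                   # complement of theta_4 U theta_8

# Bba's m0, m1, m2, m3 of Example 1 (section 2.4).
SOURCES = [
    {F1: 1.0},
    {F2: 0.75, I_T: 0.25},
    {F1: 0.3, F3: 0.7},
    {F2: 0.25, I_T: 0.75},
]


def pcr6(sources):
    """Combine bba's: conjunctive rule plus PCR6 conflict redistribution."""
    fused = {}
    for combo in product(*(m.items() for m in sources)):
        masses = [mass for _, mass in combo]
        prod = 1.0
        for mass in masses:
            prod *= mass
        inter = frozenset.intersection(*(x for x, _ in combo))
        if inter:
            # Non-conflicting product: goes to the intersection.
            fused[inter] = fused.get(inter, 0.0) + prod
        else:
            # Conflicting product: proportional redistribution.
            total = sum(masses)
            for x, mass in combo:
                fused[x] = fused.get(x, 0.0) + prod * mass / total
    return fused


def bel(m, hyp):
    """Belief: total mass of the focal elements included in hyp."""
    return sum(v for x, v in m.items() if x <= hyp)


def pl(m, hyp):
    """Plausibility: total mass of the focal elements intersecting hyp."""
    return sum(v for x, v in m.items() if x & hyp)


m_pcr6 = pcr6(SOURCES)
danger = frozenset({6, 7, 8})   # theta_6 U theta_7 U theta_8
bounds = (bel(m_pcr6, danger), pl(m_pcr6, danger))
```

Under these assumptions `bounds` agrees with the PCR6 interval $[0.38962, 1]$ given in the text, and `pl(m_pcr6, frozenset({8}))` agrees with the PCR6 upper bound 0.83189 reported for $\theta_8$.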


If we consider only $\theta_7 \cup \theta_8 = (\bar{A}, V, B) \cup (A, V, B)$
as danger supporting hypothesis then, from the fusion
result of Table 2, one gets

• with PCR5: $\Delta(\theta_7 \cup \theta_8) = 0.75625$
  $P(\theta_7 \cup \theta_8) \in [0.24375, 1]$
  $P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.75625]$

• with PCR6: $\Delta(\theta_7 \cup \theta_8) = 0.75625$
  $P(\theta_7 \cup \theta_8) \in [0.24375, 1]$
  $P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.75625]$

If we are more optimistic and we consider only the
danger supporting hypothesis $\theta_8$, then one gets

• with PCR5: $\Delta(\theta_8) = 0.55884$
  $P(\theta_8) \in [0.24375, 0.80259]$
  $P(\bar{\theta}_8) \in [0.19741, 0.75625]$

• with PCR6: $\Delta(\theta_8) = \Delta(\bar{\theta}_8) = 0.58814$
  $P(\theta_8) \in [0.24375, 0.83189]$
  $P(\bar{\theta}_8) \in [0.16811, 0.75625]$


If one approximates the bba's into probabilistic measures
with the DSmP transformation (footnote 2), one gets the results
with $\epsilon = 0.001$ presented in Table 3. One gets the highest
probability on $\theta_8$ with respect to the other alternatives and
also $DSmP(\theta_6 \cup \theta_7 \cup \theta_8) = 0.8648$. If one prefers to use
the pignistic (footnote 3) probability transformation [8], one gets
the results given in Table 4. One sees clearly that the PIC (footnote 4)
of DSmP is higher than the PIC of BetP, which makes the
decision easier to take with DSmP than with BetP in favor
of $\theta_6 \cup \theta_7 \cup \theta_8$, or $\theta_7 \cup \theta_8$, or $\theta_8$.
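The PIC criterion used in this comparison is described in the text as the dual of the normalized Shannon entropy, equal to 1 for a probability concentrated on one singleton and to 0 for an equiprobable distribution. A minimal Python sketch of that description follows; the function name `pic` and the list encoding of the probability measure are assumptions of this illustration.

```python
import math


def pic(probabilities):
    """Probabilistic information content of a discrete probability measure.

    PIC = 1 - H(P) / log(N), where H is the Shannon entropy and N the
    number of elements of the frame; terms with p = 0 contribute nothing.
    """
    n = len(probabilities)
    if n < 2:
        raise ValueError("the frame must contain at least two elements")
    entropy = -sum(p * math.log(p) for p in probabilities if p > 0.0)
    return 1.0 - entropy / math.log(n)


uniform = pic([1.0 / 8] * 8)            # equiprobable frame: PIC close to 0
certain = pic([1.0] + [0.0] * 7)        # one sure singleton: PIC equal to 1
```

A higher PIC therefore corresponds to a lower Shannon entropy, which is why the text treats both as equivalent measures of how informative a fusion result is.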


- Answer to Q1: One sees that the results provided
by PCR6 and PCR5 are very close and
do not change fundamentally the final decision to
take. Based on these very imprecise results, it
is very difficult to take the right decision without
decision-making error because the sources of
information are highly uncertain and conflicting,
but the analysis of the lower and upper bounds shows
that the most reasonable answer to the question,
based either on the max of credibility or the max
of plausibility, is to evacuate the building B since
$Bel(\theta_6 \cup \theta_7 \cup \theta_8) > Bel(\overline{\theta_6 \cup \theta_7 \cup \theta_8})$ and also
$Pl(\theta_6 \cup \theta_7 \cup \theta_8) > Pl(\overline{\theta_6 \cup \theta_7 \cup \theta_8})$. The same
conclusion is drawn when considering the element
$\theta_7 \cup \theta_8$ or $\theta_8$ alone. The same conclusion is also
drawn (more easily) based on the DSmP or BetP
values. In summary, the answer to Q1 is: Evacuation
of the building B.


2 The DSmP transformation has been introduced and justified in
detail by the authors in the book [6] (Vol. 3, Chap. 3), freely
downloadable from the web with many examples, and therefore
it will not be presented here.

3 BetP is the most used transformation to approximate a mass
of belief into a subjective probability measure. It has been
proposed by Philippe Smets in the nineties.

4 The PIC (probabilistic information content) criterion has been
introduced by John Sudano in [9] and is nothing but the dual of the
normalized Shannon entropy. PIC is in [0, 1], with PIC = 1 if the
probability measure assigns a probability one only to a particular
singleton of the frame, and PIC = 0 if all elements of the frame
are equiprobable.

In order to answer the second question (Q2), let's
compute the results of the fusions $m_0 \oplus m_2$ and
$m_0 \oplus m_1 \oplus m_3$ using the inputs given in Table 1. The fusion
results with the corresponding DSmP and BetP are given
in Tables 5-10.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7$ | 0.28824 | 0.28824
$\theta_4 \cup \theta_8$ | 0.71176 | 0.71176

Table 5: Result of $m_0 \oplus m_2$.

Table 6: $DSmP_\epsilon$ of $m_0 \oplus m_2$. [Columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$; entries missing.]

Table 7: BetP of $m_0 \oplus m_2$. [Columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$; entries missing.]
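The two masses of the $m_0 \oplus m_2$ result can be checked with the two-source PCR5 rule, for which PCR5 and PCR6 coincide. The sketch below assumes the standard two-source formulation, in which a conflicting product $m_1(X) m_2(Y)$ is split between X and Y proportionally to $m_1(X)$ and $m_2(Y)$; the function name `pcr5_two_sources` and the frozenset encoding of the focal elements are hypothetical.

```python
def pcr5_two_sources(m1, m2):
    """PCR5 combination of two bba's given as {frozenset: mass} dicts."""
    fused = {}
    for x, a in m1.items():
        for y, b in m2.items():
            inter = x & y
            if inter:
                # Conjunctive consensus on the intersection.
                fused[inter] = fused.get(inter, 0.0) + a * b
            elif a + b > 0.0:
                # Conflict a*b goes back to x and y, proportionally to a and b.
                fused[x] = fused.get(x, 0.0) + a * a * b / (a + b)
                fused[y] = fused.get(y, 0.0) + b * b * a / (a + b)
    return fused


I_T = frozenset(range(1, 9))
F1 = frozenset({4, 8})      # theta_4 U theta_8
F3 = I_T - F1               # its complement

m0 = {F1: 1.0}              # prior information
m2 = {F1: 0.3, F3: 0.7}     # ANPR system

m02 = pcr5_two_sources(m0, m2)
# m02[F1] = 0.3 + 0.7 / 1.7 and m02[F3] = 0.49 / 1.7
```

The conflicting mass here is $1 \times 0.7 = 0.7$; $\theta_4 \cup \theta_8$ recovers $0.7 \times 1 / 1.7$ of it and its complement recovers $0.7 \times 0.7 / 1.7$, which gives the two values 0.71176 and 0.28824.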


Based on the $m_0 \oplus m_2$ fusion result, one gets a total
imprecision $\Delta_{02}(\theta_6 \cup \theta_7 \cup \theta_8) = 1$ when considering
$\theta_6 \cup \theta_7 \cup \theta_8$ or $\theta_7 \cup \theta_8$ since

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0, 1]$

$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 1]$

and

$P(\theta_7 \cup \theta_8) \in [0, 1]$

$P(\overline{\theta_7 \cup \theta_8}) \in [0, 1]$

Even when considering only the danger supporting
hypothesis $\theta_8$, one still gets a quite large imprecision on
$P(\theta_8)$ since $\Delta_{02}(\theta_8) = 0.71176$ with

$P(\theta_8) \in [0, 0.71176]$

$P(\bar{\theta}_8) \in [0.28824, 1]$

Based on the max of Bel or max of Pl criteria, one sees
that it is not possible to take any rational decision from
$\theta_6 \cup \theta_7 \cup \theta_8$ nor $\theta_7 \cup \theta_8$ because of the full imprecision
range of $P(\theta_6 \cup \theta_7 \cup \theta_8)$ or $P(\theta_7 \cup \theta_8)$. The decision
using $m_0 \oplus m_2$ (i.e. with prior information $m_0$ and
ANPR system $m_2$) based only on the supporting hypothesis
$\theta_8$ should be to NOT evacuate the building B. The same
decision would be taken based on the DSmP or BetP values.





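focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_8$ | 0.8125 | 0.8125
$\theta_4 \cup \theta_8$ | 0.1875 | 0.1875

Table 8: Result of $m_0 \oplus m_1 \oplus m_3$.

[Table 9: $DSmP_\epsilon$ of $m_0 \oplus m_1 \oplus m_3$, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$; entries missing.]

Table 10: BetP of $m_0 \oplus m_1 \oplus m_3$. [Columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$; the only surviving entry is 0.90625 in both columns, which corresponds to $\theta_8$.]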
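The pignistic transformation behind the BetP tables shares the mass of each focal element equally among its singletons. A minimal Python sketch, applied to the $m_0 \oplus m_1 \oplus m_3$ result of Table 8, is given below; the function name `betp` and the frozenset encoding are assumptions of this illustration, and the sketch supposes a normalized bba under Shafer's model (no mass on the empty set).

```python
def betp(m, frame):
    """Pignistic probability of each singleton of the frame."""
    prob = {t: 0.0 for t in frame}
    for focal, mass in m.items():
        if not focal:
            raise ValueError("the empty set must not carry any mass")
        share = mass / len(focal)   # equal split among the singletons
        for t in focal:
            prob[t] += share
    return prob


# Table 8: masses of theta_8 and theta_4 U theta_8 (same for PCR5 and PCR6).
m013 = {frozenset({8}): 0.8125, frozenset({4, 8}): 0.1875}

p = betp(m013, range(1, 9))
# p[8] = 0.8125 + 0.1875 / 2 = 0.90625 and p[4] = 0.09375
```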


Based on the $m_0 \oplus m_1 \oplus m_3$ fusion result, one gets
$\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.1875$ with

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.8125, 1]$

$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.1875]$

but also

$P(\theta_7 \cup \theta_8) \in [0.8125, 1]$,  $P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.1875]$

and

$P(\theta_8) \in [0.8125, 1]$,  $P(\bar{\theta}_8) \in [0, 0.1875]$

Based on the max of Bel or max of Pl criteria, the decision
using $m_0 \oplus m_1 \oplus m_3$ (i.e. with prior information $m_0$
and both analysts) is to evacuate the building B. The
same decision is taken based on the DSmP or BetP values.
It is worth noting that the precision of the result obtained
with $m_0 \oplus m_1 \oplus m_3$ is much better than with
$m_0 \oplus m_2$ since $\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) < \Delta_{02}(\theta_6 \cup \theta_7 \cup \theta_8)$,
or $\Delta_{013}(\theta_7 \cup \theta_8) < \Delta_{02}(\theta_7 \cup \theta_8)$. Moreover it is easy to
verify that the $m_0 \oplus m_1 \oplus m_3$ fusion system is more
informative than the $m_0 \oplus m_2$ fusion system because the Shannon
entropy of DSmP of $m_0 \oplus m_2$ is much larger than the Shannon
entropy of DSmP of $m_0 \oplus m_1 \oplus m_3$.


- Answer to Q2: Since the information obtained
by the fusion $m_0 \oplus m_2$ is less informative and less
precise than the information obtained with the
fusion $m_0 \oplus m_1 \oplus m_3$, it is better to choose and
to trust the fusion system $m_0 \oplus m_1 \oplus m_3$ rather
than $m_0 \oplus m_2$. Based on this choice, the final
decision will be to evacuate the building B, which
is consistent with the answer to question Q1.


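Example 2: Let's modify a bit the previous Table 1
and take higher belief for sources 1 and 3 as

focal element | $m_0(.)$ | $m_1(.)$ | $m_2(.)$ | $m_3(.)$
$\theta_4 \cup \theta_8$ | 1 | 0 | 0.3 | 0
$\theta_6 \cup \theta_8$ | 0 | 0.9 | 0 | 0.1
$\overline{\theta_4 \cup \theta_8}$ | 0 | 0 | 0.7 | 0
$I_t$ | 0 | 0.1 | 0 | 0.9

Table 11: Quantitative inputs of VBIED problem.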


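The results of the fusion $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ using
PCR5 and PCR6 and the corresponding DSmP values
are given in Tables 12-13.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7$ | 0.16525 | 0.14865
$\theta_8$ | 0.27300 | 0.27300
$\theta_4 \cup \theta_8$ | 0.26307 | 0.23935
$\theta_6 \cup \theta_8$ | 0.14934 | 0.16950
$I_t$ | 0.14934 | 0.16950

Table 12: Results of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 11.

Table 13: $DSmP_\epsilon$ of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 11. [Columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$; entries missing.]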


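Table 14: BetP of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 11. [Columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$; entries missing.]

From the fusion result of Table 12, one gets when
considering $\theta_6 \cup \theta_7 \cup \theta_8$

• with PCR5: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.57766$
  $P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.42234, 1]$
  $P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.57766]$

• with PCR6: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.5575$
  $P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.4425, 1]$
  $P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.5575]$

and when considering $\theta_7 \cup \theta_8$

• with PCR5: $\Delta(\theta_7 \cup \theta_8) = 0.7270$
  $P(\theta_7 \cup \theta_8) \in [0.27300, 1]$,  $P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.7270]$

• with PCR6: $\Delta(\theta_7 \cup \theta_8) = 0.7270$
  $P(\theta_7 \cup \theta_8) \in [0.27300, 1]$,  $P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.7270]$

and when considering $\theta_8$ only

• with PCR5: $\Delta(\theta_8) = 0.56175$
  $P(\theta_8) \in [0.27300, 0.83475]$
  $P(\bar{\theta}_8) \in [0.16525, 0.7270]$

• with PCR6: $\Delta(\theta_8) = 0.57835$
  $P(\theta_8) \in [0.27300, 0.85135]$
  $P(\bar{\theta}_8) \in [0.14865, 0.7270]$

One gets also the following $DSmP_{\epsilon=0.001}$ values

$DSmP_{\epsilon,PCR5}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.8861$
$DSmP_{\epsilon,PCR6}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.8869$
$DSmP_{\epsilon,PCR5}(\theta_7 \cup \theta_8) = 0.8575$
$DSmP_{\epsilon,PCR6}(\theta_7 \cup \theta_8) = 0.8709$
$DSmP_{\epsilon,PCR5}(\theta_8) = 0.8294$
$DSmP_{\epsilon,PCR6}(\theta_8) = 0.8455$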
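The DSmP values listed above can be approximated from the PCR5 column of Table 12 with a short Python sketch. It assumes the usual $DSmP_\epsilon$ definition under Shafer's model: the mass of each focal element Y is shared among its singletons proportionally to their own masses plus $\epsilon$. The function name `dsmp` and the frozenset encoding are hypothetical, and small differences with the published figures are expected because the table entries are rounded.

```python
def dsmp(m, frame, eps=0.001):
    """DSmP_eps probability of each singleton (Shafer's model assumed)."""
    singles = {t: m.get(frozenset({t}), 0.0) for t in frame}
    prob = {t: 0.0 for t in frame}
    for focal, mass in m.items():
        denom = sum(singles[t] for t in focal) + eps * len(focal)
        if denom == 0.0:
            raise ValueError("eps = 0 with no singleton mass in a focal element")
        for t in focal:
            # Share proportional to the singleton mass, regularized by eps.
            prob[t] += mass * (singles[t] + eps) / denom
    return prob


I_T = frozenset(range(1, 9))
# PCR5 column of Table 12.
m_pcr5 = {
    I_T - frozenset({4, 8}): 0.16525,
    frozenset({8}): 0.27300,
    frozenset({4, 8}): 0.26307,
    frozenset({6, 8}): 0.14934,
    I_T: 0.14934,
}

p = dsmp(m_pcr5, range(1, 9))
danger_all = p[6] + p[7] + p[8]     # hypothesis theta_6 U theta_7 U theta_8
danger_car = p[7] + p[8]            # hypothesis theta_7 U theta_8
```

Because $\theta_8$ is the only singleton carrying mass, it captures almost all the mass of every focal element that contains it, which explains why DSmP is so much sharper than BetP here.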


- Answer to Q1: Using an analysis similar to the
one done for Example 1, based on the max of credibility
or max of plausibility criteria, or by considering
the DSmP or BetP values of $\theta_6 \cup \theta_7 \cup \theta_8$, or $\theta_7 \cup \theta_8$,
or $\theta_8$, the decision to take is: Evacuate the building B.

In order to answer the second question (Q2) for
this Example 2, let's compute the results of the
fusions $m_0 \oplus m_2$ and $m_0 \oplus m_1 \oplus m_3$ using the inputs given
in Table 11. Since the inputs $m_0$ and $m_2$ are the same
as those in Example 1, the $m_0 \oplus m_2$ fusion results with
the corresponding DSmP and BetP are those already given in
Tables 5-7. Only the fusion $m_0 \oplus m_1 \oplus m_3$ must be derived
with the new bba's $m_1$ and $m_3$ chosen for this Example
2. The $m_0 \oplus m_1 \oplus m_3$ fusion results obtained with
PCR5 and PCR6, and the corresponding DSmP and
BetP values, are shown in Tables 15-17. According to
these results, one gets with the PCR5 or PCR6 fusion
$m_0 \oplus m_1 \oplus m_3$: $\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) = \Delta_{013}(\theta_7 \cup \theta_8) =
\Delta_{013}(\theta_8) = 0.09$ and

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.91, 1]$,  $P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.09]$

$P(\theta_7 \cup \theta_8) \in [0.91, 1]$,  $P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.09]$

$P(\theta_8) \in [0.91, 1]$,  $P(\bar{\theta}_8) \in [0, 0.09]$

[Table 15: Result of $m_0 \oplus m_1 \oplus m_3$, with columns $m_{PCR5}(.)$ and $m_{PCR6}(.)$; entries missing.]

[Table 16: $DSmP_\epsilon$ of $m_0 \oplus m_1 \oplus m_3$, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$; entries missing.]

Table 17: BetP of $m_0 \oplus m_1 \oplus m_3$. [Columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$; entries missing.]


Based on the max of Bel or max of Pl criteria, the decision
using $m_0 \oplus m_1 \oplus m_3$ (i.e. with prior information
$m_0$ and both analysts) is to evacuate the building B.
The same decision is taken based on the DSmP or BetP values.
It is worth noting that the precision of the result obtained
with $m_0 \oplus m_1 \oplus m_3$ is much better than with
$m_0 \oplus m_2$ since $\Delta_{013}(\theta_8) < \Delta_{02}(\theta_8)$, or
$\Delta_{013}(\theta_7 \cup \theta_8) < \Delta_{02}(\theta_7 \cup \theta_8)$, or
$\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) < \Delta_{02}(\theta_6 \cup \theta_7 \cup \theta_8)$.
Moreover it is easy to verify that the $m_0 \oplus m_1 \oplus m_3$ fusion
system is more informative than the $m_0 \oplus m_2$ fusion system
because the Shannon entropy of DSmP of $m_0 \oplus m_2$ is much
larger than the Shannon entropy of DSmP of $m_0 \oplus m_1 \oplus m_3$.
The same remark holds with the BetP transformation.


- Answer to Q2: Since the information obtained
by the fusion $m_0 \oplus m_2$ is less informative and less
precise than the information obtained with the
fusion $m_0 \oplus m_1 \oplus m_3$, it is better to choose and to
trust the fusion system $m_0 \oplus m_1 \oplus m_3$ rather than
$m_0 \oplus m_2$. Based on this choice, the final decision
will be to evacuate the building B, which is consistent
with the answer to question Q1.


3.2 Impact of prior information 


To see the impact of the quality/reliability of the prior
information on the result, let's modify the input $m_0(.)$
in the previous Tables 1 and 11 and consider now a very
uncertain prior source.


Example 3: We consider the very uncertain
prior source of information $m_0(\theta_4 \cup \theta_8) = 0.1$
and $m_0(I_t) = 0.9$. The results for the modified inputs
of Table 18 are given in Tables 19 and 20.


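focal element | $m_0(.)$ | $m_1(.)$ | $m_2(.)$ | $m_3(.)$
$\theta_4 \cup \theta_8$ | 0.1 | 0 | 0.3 | 0
$\theta_6 \cup \theta_8$ | 0 | 0.75 | 0 | 0.25
$\overline{\theta_4 \cup \theta_8}$ | 0 | 0 | 0.7 | 0
$I_t$ | 0.9 | 0.25 | 0 | 0.75

Table 18: Quantitative inputs of VBIED problem.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_6$ | 0.511870 | 0.511870
$\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7$ | 0.151070 | 0.142670
$\theta_8$ | 0.243750 | 0.243750
$\theta_4 \cup \theta_8$ | 0.060957 | 0.059757
$\theta_6 \cup \theta_8$ | 0.016173 | 0.020973
$I_t$ | 0.016173 | 0.020973

Table 19: Result of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 18.

Table 20: $DSmP_\epsilon$ of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 18. [Columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$; entries missing.]

Table 21: BetP of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 18. [Entries missing.]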


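From the fusion result of Table 19, one gets when
considering $\theta_6 \cup \theta_7 \cup \theta_8$

• with PCR5: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.221377$
  $P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.778623, 1]$
  $P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.221377]$

• with PCR6: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.21465$
  $P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.78535, 1]$
  $P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.21465]$

and when considering $\theta_7 \cup \theta_8$

• with PCR5: $\Delta(\theta_7 \cup \theta_8) = 0.24438$
  $P(\theta_7 \cup \theta_8) \in [0.24375, 0.48813]$
  $P(\overline{\theta_7 \cup \theta_8}) \in [0.51187, 0.75625]$

• with PCR6: $\Delta(\theta_7 \cup \theta_8) = 0.24438$
  $P(\theta_7 \cup \theta_8) \in [0.24375, 0.48813]$
  $P(\overline{\theta_7 \cup \theta_8}) \in [0.51187, 0.75625]$

and when considering $\theta_8$ only, one has

• with PCR5: $\Delta(\theta_8) = 0.09331$
  $P(\theta_8) \in [0.24375, 0.33706]$
  $P(\bar{\theta}_8) \in [0.66294, 0.75625]$

• with PCR6: $\Delta(\theta_8) = 0.10171$
  $P(\theta_8) \in [0.24375, 0.34546]$
  $P(\bar{\theta}_8) \in [0.65454, 0.75625]$

Using the DSmP transformation, one gets a low probability
in $\theta_8$ or in $\theta_7 \cup \theta_8$ because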








$DSmP_{\epsilon,PCR5}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.9985$
$DSmP_{\epsilon,PCR6}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.9986$
$DSmP_{\epsilon,PCR5}(\theta_7 \cup \theta_8) = 0.3152$
$DSmP_{\epsilon,PCR6}(\theta_7 \cup \theta_8) = 0.3171$
$DSmP_{\epsilon,PCR5}(\theta_8) = 0.3149$
$DSmP_{\epsilon,PCR6}(\theta_8) = 0.3168$

- Answer to Q1: The analysis of these results is
very interesting since one sees that the element of
the frame $\Theta$ having the highest DSmP (or BetP)
is $\theta_6 = (A, \bar{V}, B)$, and it has a very strong impact
on the final decision. Indeed, if one considers only
$\theta_8$ or $\theta_7 \cup \theta_8$ as decision-support hypotheses, one
sees that the decision to take is to NOT evacuate
the building B since one gets a low probability in
$\theta_8$ or in $\theta_7 \cup \theta_8$. Whereas if we include also $\theta_6$ in the
decision-support hypothesis, then the final decision
will be the opposite since $DSmP(\theta_6 \cup \theta_7 \cup \theta_8)$ is very
close to one with PCR5 or with PCR6. The same
behavior occurs with BetP. So there is a strong impact
of the prior information on the final decision since,
without strong prior information supporting $\theta_4 \cup \theta_8$,
we have to conclude either to the non-evacuation
of building B based on the max of credibility, the
max of plausibility or the max of DSmP using $\theta_8$
or $\theta_7 \cup \theta_8$ for decision-making, or to the evacuation
of the building if a more prudent strategy is used
based on the $\theta_6 \cup \theta_7 \cup \theta_8$ decision-support hypothesis.


Let's examine the results of the fusion systems $m_0 \oplus m_2$
and $m_0 \oplus m_1 \oplus m_3$ given in Tables 22-27.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7$ | 0.69125 | 0.69125
$\theta_4 \cup \theta_8$ | 0.30875 | 0.30875

Table 22: Result of $m_0 \oplus m_2$.

[Table 23: $DSmP_\epsilon$ of $m_0 \oplus m_2$, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$; entries missing.]

Table 24: BetP of $m_0 \oplus m_2$. [Columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$; entries missing.]

Based on the $m_0 \oplus m_2$ fusion result, one gets a large
imprecision on the evaluation of the probabilities of the
decision-support hypotheses since for $\theta_6 \cup \theta_7 \cup \theta_8$, one has
$\Delta_{02}(\theta_6 \cup \theta_7 \cup \theta_8) = 1$ and

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 1]$

for $\theta_7 \cup \theta_8$, one has also $\Delta_{02}(\theta_7 \cup \theta_8) = 1$ with

$P(\theta_7 \cup \theta_8) \in [0, 1]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0, 1]$

and for $\theta_8$, one gets $\Delta_{02}(\theta_8) = 0.30875$ with

$P(\theta_8) \in [0, 0.30875]$
$P(\bar{\theta}_8) \in [0.69125, 1]$

One sees that it is impossible to take a decision
when considering only $\theta_6 \cup \theta_7 \cup \theta_8$ or $\theta_7 \cup \theta_8$ because
of the full imprecision of the corresponding probabilities.
However, based on the max of Bel or max of Pl criteria
on $\theta_8$, the decision using $m_0 \oplus m_2$ (i.e. with uncertain
prior information $m_0$ and ANPR system $m_2$) is to
NOT evacuate the building B. According to Tables
23-24, one sees also an ambiguity in decision-making
between $\theta_8$ and $\theta_4$ since they have the same DSmP (or
BetP) values.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_8$ | 0.08125 | 0.08125
$\theta_4 \cup \theta_8$ | 0.01875 | 0.01875
… | … | …

Table 25: Result of $m_0 \oplus m_1 \oplus m_3$.

Table 26: $DSmP_\epsilon$ of $m_0 \oplus m_1 \oplus m_3$. [Columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$; entries missing.]

Table 27: BetP of $m_0 \oplus m_1 \oplus m_3$. [Columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$; entries missing.]

Based on the $m_0 \oplus m_1 \oplus m_3$ fusion results given in Tables
25-27, one gets for $\theta_6 \cup \theta_7 \cup \theta_8$ the imprecision
$\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.1875$ with

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.8125, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.1875]$

for $\theta_7 \cup \theta_8$, one gets $\Delta_{013}(\theta_7 \cup \theta_8) = 0.91875$ with

$P(\theta_7 \cup \theta_8) \in [0.08125, 1]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.91875]$

and for $\theta_8$, one gets $\Delta_{013}(\theta_8) = 0.91875$ with

$P(\theta_8) \in [0.08125, 1]$
$P(\bar{\theta}_8) \in [0, 0.91875]$

Based on the max of Bel or max of Pl criteria, the decision
using $m_0 \oplus m_1 \oplus m_3$ is the evacuation of the building
B. The same decision is drawn when using the DSmP or BetP
results according to Tables 26 and 27. With this uncertain
prior information, it is worth noting that the
precision of the result obtained with $m_0 \oplus m_1 \oplus m_3$ is
better than with $m_0 \oplus m_2$ when considering (in a cautious
strategy) the decision-support hypotheses $\theta_6 \cup \theta_7 \cup \theta_8$
or $\theta_7 \cup \theta_8$ since $\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) < \Delta_{02}(\theta_6 \cup \theta_7 \cup \theta_8)$, or
$\Delta_{013}(\theta_7 \cup \theta_8) < \Delta_{02}(\theta_7 \cup \theta_8)$. However, if a more
optimistic/risky strategy is used when considering only $\theta_8$
as decision-support hypothesis, it is preferable to choose
the subsystem $m_0 \oplus m_2$ because $\Delta_{02}(\theta_8) < \Delta_{013}(\theta_8)$.
Nevertheless, one sees that globally the $m_0 \oplus m_1 \oplus m_3$ fusion
system is more informative than the $m_0 \oplus m_2$ fusion system
because the Shannon entropy of DSmP of $m_0 \oplus m_2$ is much
larger than the Shannon entropy of DSmP of $m_0 \oplus m_1 \oplus m_3$.


- Answer to Q2: The answer to question Q2 is
not easy because it depends both on the criterion
(precision or PIC) and on the decision-support
hypothesis we choose. Based on the precision
criterion and taking the optimistic point of view using
only $\theta_8$, it is better to trust the $m_0 \oplus m_2$ fusion
system since $\Delta_{02}(\theta_8) = \Delta_{02}(\bar{\theta}_8) = 0.30875$ whereas
$\Delta_{013}(\theta_8) = \Delta_{013}(\bar{\theta}_8) = 0.91875$. In such a case, one
should NOT evacuate the building B. If we consider
that it is better to trust the result of the $m_0 \oplus m_1 \oplus m_3$
fusion system because it is more informative than
$m_0 \oplus m_2$, then the decision should be to evacuate
the building B. If we take a more prudent point of
view in considering as decision-support hypotheses
either $\theta_6 \cup \theta_7 \cup \theta_8$ or $\theta_7 \cup \theta_8$, then the final decision
taken according to the (most precise and informative)
subsystem $m_0 \oplus m_1 \oplus m_3$ is to evacuate the
building B.

So the main open question is how to choose between
the $m_0 \oplus m_2$ and the $m_0 \oplus m_1 \oplus m_3$
fusion systems. In the authors' opinion, in such a case
it seems better to base our choice on the precision
level of the information one really has in hand (rather
than on the PIC value, which is always related to some
ad-hoc probabilistic transformation) and to adopt
the most prudent strategy. Therefore, for this
example, the final decision must be made according
to $m_0 \oplus m_1 \oplus m_3$, i.e. evacuate the building B.


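Example 4: Let’s modify a bit the previous input Table 18 and take higher belief for sources 1 and 3 as

[Table 28: entries not recoverable from the source.]
Table 28: Quantitative inputs of VBIED problem.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_6$ | 0.573300 | 0.573300
$\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7$ | 0.082365 | 0.077355
$\theta_8$ | 0.273000 | 0.273000
$\theta_4 \cup \theta_8$ | 0.030666 | 0.029951
$\theta_6 \cup \theta_8$ | 0.020334 | 0.023197
$I_t$ | 0.020334 | 0.023197

Table 29: Result of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 28.

[Table 30, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$: entries not recoverable from the source.]

[Table 31, with columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$: entries not recoverable from the source.]
Table 31: BetP of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 28.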
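Therefore, one gets when considering $\theta_6 \cup \theta_7 \cup \theta_8$

• with PCR5: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.133366$
$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.866634, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.133366]$

• with PCR6: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.130503$
$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.869497, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.130503]$

and when considering $\theta_7 \cup \theta_8$

• with PCR5: $\Delta(\theta_7 \cup \theta_8) = 0.1537$
$P(\theta_7 \cup \theta_8) \in [0.2730, 0.4267]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0.5733, 0.7270]$

• with PCR6: $\Delta(\theta_7 \cup \theta_8) = 0.1537$
$P(\theta_7 \cup \theta_8) \in [0.2730, 0.4267]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0.5733, 0.7270]$

and when considering $\theta_8$ only

• with PCR5: $\Delta(\theta_8) = 0.071335$
$P(\theta_8) \in [0.2730, 0.344335]$
$P(\bar{\theta}_8) \in [0.655665, 0.7270]$

• with PCR6: $\Delta(\theta_8) = 0.076345$
$P(\theta_8) \in [0.2730, 0.349345]$
$P(\bar{\theta}_8) \in [0.650655, 0.7270]$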
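The imprecision ranges above follow directly from the belief and plausibility of each decision-support hypothesis. As a minimal sketch, assuming the PCR5 masses of Table 29 and encoding $\theta_1, \ldots, \theta_8$ as the integers 1 to 8 (the helper names `bel`, `pl` and `imprecision` are illustrative, not from the source), the bounds can be recomputed as follows:

```python
from typing import Dict, FrozenSet

Focal = FrozenSet[int]

# Frame of discernment: integers 1..8 stand for theta_1..theta_8.
FRAME: Focal = frozenset(range(1, 9))

# PCR5 column of Table 29 (values copied from the table).
m_pcr5: Dict[Focal, float] = {
    frozenset({6}): 0.573300,
    FRAME - {4, 8}: 0.082365,   # theta_1 u theta_2 u theta_3 u theta_5 u theta_6 u theta_7
    frozenset({8}): 0.273000,
    frozenset({4, 8}): 0.030666,
    frozenset({6, 8}): 0.020334,
    FRAME: 0.020334,            # total ignorance I_t
}

def bel(m: Dict[Focal, float], a: Focal) -> float:
    """Belief: total mass of focal elements included in a."""
    return sum(v for x, v in m.items() if x <= a)

def pl(m: Dict[Focal, float], a: Focal) -> float:
    """Plausibility: total mass of focal elements intersecting a."""
    return sum(v for x, v in m.items() if x & a)

def imprecision(m: Dict[Focal, float], a: Focal) -> float:
    """Width of the interval [Bel(a), Pl(a)] bounding P(a)."""
    return pl(m, a) - bel(m, a)

if __name__ == "__main__":
    for name, hyp in [("t6 u t7 u t8", frozenset({6, 7, 8})),
                      ("t7 u t8", frozenset({7, 8})),
                      ("t8", frozenset({8}))]:
        print(name, round(bel(m_pcr5, hyp), 6), round(pl(m_pcr5, hyp), 6),
              round(imprecision(m_pcr5, hyp), 6))
```

Small differences in the last digit with respect to the values quoted in the text come from the rounding of the tabulated masses.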
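Based on the DSmP transformation, one gets a pretty low probability on $\theta_8$ and on $\theta_7 \cup \theta_8$, but a very high probability on the most prudent decision-support hypothesis $\theta_6 \cup \theta_7 \cup \theta_8$ because

$DSmP_{\epsilon,PCR5}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.9992$
$DSmP_{\epsilon,PCR6}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.9993$
$DSmP_{\epsilon,PCR5}(\theta_7 \cup \theta_8) = 0.3168$
$DSmP_{\epsilon,PCR6}(\theta_7 \cup \theta_8) = 0.3180$
$DSmP_{\epsilon,PCR5}(\theta_8) = 0.3166$
$DSmP_{\epsilon,PCR6}(\theta_8) = 0.3178$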
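The DSmP values quoted above can be cross-checked numerically. The sketch below is a hedged illustration, assuming the usual DSmP definition (each focal element's mass is shared among its singletons proportionally to their own masses plus $\epsilon$), the value $\epsilon = 0.001$ used later in the text, and the PCR5 masses of Table 29; the function name `dsmp` and the integer encoding of $\theta_1, \ldots, \theta_8$ are illustrative choices.

```python
from typing import Dict, FrozenSet

Focal = FrozenSet[int]
FRAME: Focal = frozenset(range(1, 9))  # 1..8 stand for theta_1..theta_8

# PCR5 column of Table 29.
m_pcr5: Dict[Focal, float] = {
    frozenset({6}): 0.573300,
    FRAME - {4, 8}: 0.082365,
    frozenset({8}): 0.273000,
    frozenset({4, 8}): 0.030666,
    frozenset({6, 8}): 0.020334,
    FRAME: 0.020334,
}

def dsmp(m: Dict[Focal, float], eps: float = 0.001) -> Dict[int, float]:
    """Share each focal mass among its singletons, proportionally to m(singleton) + eps."""
    frame = frozenset().union(*m)
    single = {t: m.get(frozenset({t}), 0.0) for t in frame}
    out = {t: 0.0 for t in frame}
    for y, mass in m.items():
        denom = sum(single[t] for t in y) + eps * len(y)
        for t in y:
            out[t] += mass * (single[t] + eps) / denom
    return out

if __name__ == "__main__":
    p = dsmp(m_pcr5)
    print("DSmP(t8)           =", round(p[8], 4))
    print("DSmP(t7 u t8)      =", round(p[7] + p[8], 4))
    print("DSmP(t6 u t7 u t8) =", round(p[6] + p[7] + p[8], 4))
```

Because the result is a probability measure on singletons, the DSmP of a union such as $\theta_6 \cup \theta_7 \cup \theta_8$ is obtained by summing the singleton values.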


- Answer to Q1: Based on these results, one sees that the decision based either on the max of credibility, the max of plausibility or the max of DSmP, considering either $\theta_8$ or $\theta_7 \cup \theta_8$, is to NOT evacuate the building B, whereas the most prudent/cautious strategy suggests the opposite, i.e. the evacuation of the building B.


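Let’s examine the results of the fusion systems $m_0 \oplus m_2$ and $m_0 \oplus m_1 \oplus m_3$ corresponding to the input Table 28. Naturally, one gets the same results for the fusion $m_0 \oplus m_2$ as in Example 3, and for the fusion $m_0 \oplus m_1 \oplus m_3$ one gets:

[Table 32, with columns $m_{PCR5}(.)$ and $m_{PCR6}(.)$: entries not recoverable from the source.]
Table 32: Result of $m_0 \oplus m_1 \oplus m_3$.

[Table 33, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$: entries not recoverable from the source.]
Table 33: $DSmP_\epsilon$ of $m_0 \oplus m_1 \oplus m_3$.

[Table 34, with columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$: entries not recoverable from the source.]
Table 34: BetP of $m_0 \oplus m_1 \oplus m_3$.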


As in Example 3, based on the $m_0 \oplus m_2$ fusion result, one gets a large imprecision on the evaluation of the probabilities of the decision-support hypotheses, since for $\theta_6 \cup \theta_7 \cup \theta_8$ one has $\Delta_{02}(\theta_6 \cup \theta_7 \cup \theta_8) = 1$ and

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 1]$

for $\theta_7 \cup \theta_8$, one has also $\Delta_{02}(\theta_7 \cup \theta_8) = 1$ with

$P(\theta_7 \cup \theta_8) \in [0, 1]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0, 1]$

and for $\theta_8$, one gets $\Delta_{02}(\theta_8) = 0.30875$ with

$P(\theta_8) \in [0, 0.30875]$
$P(\bar{\theta}_8) \in [0.69125, 1]$

One sees that it is impossible to take a decision when considering only $\theta_6 \cup \theta_7 \cup \theta_8$ or $\theta_7 \cup \theta_8$ because of the full imprecision of the corresponding probabilities. Based on max of Bel or max of Pl criteria on $\theta_8$, the decision using $m_0 \oplus m_2$ is to NOT evacuate the building B. According to Tables 23-24, an ambiguity appears in decision-making between $\theta_8$ and $\theta_4$ since they have the same DSmP (or BetP) values.

Based on the $m_0 \oplus m_1 \oplus m_3$ fusion result, one gets $\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.09$ for $\theta_6 \cup \theta_7 \cup \theta_8$ with

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.91, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.09]$

and $\Delta_{013}(\theta_7 \cup \theta_8) = 0.9090$ for $\theta_7 \cup \theta_8$ with

$P(\theta_7 \cup \theta_8) \in [0.091, 1]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.9090]$

and $\Delta_{013}(\theta_8) = 0.9090$ for $\theta_8$ only with

$P(\theta_8) \in [0.091, 1]$
$P(\bar{\theta}_8) \in [0, 0.9090]$

Based on max of Bel or max of Pl criteria on either $\theta_8$, $\theta_7 \cup \theta_8$ or $\theta_6 \cup \theta_7 \cup \theta_8$, the decision using $m_0 \oplus m_1 \oplus m_3$ must be the evacuation of the building B. The same decision is drawn using DSmP or BetP results according to Tables 33 and 34.


- Answer to Q2: Similar remarks and conclusions to those given in Example 3 also hold for Example 4, i.e. it is better to adopt the most prudent strategy (i.e. to consider $\theta_6 \cup \theta_7 \cup \theta_8$ as decision-support hypothesis) and to trust the most precise fusion system with respect to this hypothesis, which is in this example the subsystem $m_0 \oplus m_1 \oplus m_3$. Based only on $m_0 \oplus m_1 \oplus m_3$, the final decision will be to evacuate the building B when one has in hand such highly uncertain prior information $m_0$.


3.3 Impact of no prior information 


Example 5: Let’s examine the result of the fusion process if one doesn’t include⁵ the prior information $m_0(.)$ and if we combine directly only the three sources $m_1 \oplus m_2 \oplus m_3$ altogether with PCR5 or PCR6.

[Table 35: entries not recoverable from the source.]
Table 35: Quantitative inputs of VBIED problem.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_6$ | 0.56875 | 0.56875
$\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7$ | 0.13125 | 0.13125
$\theta_8$ | 0.24375 | 0.24375
$\theta_4 \cup \theta_8$ | 0.05625 | 0.05625

Table 36: Result of $m_1 \oplus m_2 \oplus m_3$ for Table 35.

⁵ Or equivalently we can take $m_0$ as the vacuous bba corresponding to $m_0(I_t) = 1$ and to the fully ignorant prior source.

[Table 37, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$: entries not recoverable from the source.]

[Table 38, with columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$: entries not recoverable from the source.]
Table 38: BetP of $m_1 \oplus m_2 \oplus m_3$ for Table 35.


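One gets $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.1875$ for $\theta_6 \cup \theta_7 \cup \theta_8$ with

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.8125, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.1875]$

for $\theta_7 \cup \theta_8$, one has also $\Delta(\theta_7 \cup \theta_8) = 0.1875$ with

$P(\theta_7 \cup \theta_8) \in [0.24375, 0.43125]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0.56875, 0.75625]$

and for $\theta_8$, one gets $\Delta(\theta_8) = 0.05625$ with

$P(\theta_8) \in [0.24375, 0.3]$, $P(\bar{\theta}_8) \in [0.7, 0.75625]$

The result presented in Table 36 is obviously the same as the one we obtain by combining the sources $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ altogether when taking $m_0(.)$ as the vacuous belief assignment, i.e. when $m_0(I_t) = 1$.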


- Answer to Q1: Based on the results of Tables 36-38, the decision based on max of belief or max of plausibility on either $\theta_8$ or $\theta_7 \cup \theta_8$ is to NOT evacuate the building B. The same conclusion is obtained when analyzing the values of DSmP or BetP of $\theta_8$ or $\theta_7 \cup \theta_8$. However, if we adopt the most prudent strategy based on the decision-support hypothesis $\theta_6 \cup \theta_7 \cup \theta_8$, the decision will be to evacuate the building B since $DSmP_\epsilon(\theta_6 \cup \theta_7 \cup \theta_8) = 0.9989$. So we see the strong impact of the lack of prior information in the decision-making support process (by comparison between Example 1 and this example) when adopting more risky strategies for decision-making based either on $\theta_7 \cup \theta_8$ or on $\theta_8$ only.

Let’s now compare the source $m_2$ with the $m_1 \oplus m_3$ fusion system when no prior information $m_0$ is used. Naturally, there is no need to fuse $m_2$ since we consider it alone. One has $\Delta_2(\theta_6 \cup \theta_7 \cup \theta_8) = \Delta_2(\theta_7 \cup \theta_8) = 1$ (i.e. the full imprecision on $P(\theta_6 \cup \theta_7 \cup \theta_8)$ and on $P(\theta_7 \cup \theta_8)$) whereas $\Delta_2(\theta_8) = 0.3$ with

$P(\theta_8) \in [0, 0.3]$, $P(\bar{\theta}_8) \in [0.7, 1]$

DSmP and BetP of $m_2(.)$ are the same since there is no singleton as focal element for $m_2(.)$ - see Table 39.

[Table 39, with columns $DSmP_\epsilon(.)$ and $BetP(.)$: entries not recoverable from the source.]
Table 39: $DSmP_\epsilon$ and BetP of $m_2$.

Based on max of Bel or max of Pl criteria on $\theta_8$ (optimistic/risky strategy), the decision using $m_2$ (ANPR system alone) should be to NOT evacuate the building B. No decision can be taken using the decision-support hypotheses $\theta_6 \cup \theta_7 \cup \theta_8$ or $\theta_7 \cup \theta_8$, nor on DSmP or BetP values, since there is an ambiguity between $\theta_8$ and $\theta_4$.


Now if we combine $m_1$ with $m_3$ using PCR5 or PCR6 we get⁶ the results given in Tables 40 and 41.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_6 \cup \theta_8$ | 0.8125 | 0.8125
$I_t$ | 0.1875 | 0.1875

Table 40: Result of $m_1 \oplus m_3$.

[Table 41, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$: entries not recoverable from the source.]
Table 41: $DSmP_\epsilon$ of $m_1 \oplus m_3$.

The values of BetP(.) are the same as those of DSmP(.) because there is no singleton as focal element of $m_1 \oplus m_3$.

⁶ Note that for two sources, PCR6 equals PCR5 [6], Vol. 2.

Whence $\Delta_{13}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.1875$ with

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.8125, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.1875]$

and $\Delta_{13}(\theta_7 \cup \theta_8) = 1$ with

$P(\theta_7 \cup \theta_8) \in [0, 1]$, $P(\overline{\theta_7 \cup \theta_8}) \in [0, 1]$

and also $\Delta_{13}(\theta_8) = 1$ with

$P(\theta_8) \in [0, 1]$, $P(\bar{\theta}_8) \in [0, 1]$

Based on max of Bel or max of Pl criteria on $\theta_6 \cup \theta_7 \cup \theta_8$, the decision using $m_1 \oplus m_3$ must be the evacuation of the building B. The same decision must be drawn when using DSmP results according to Table 41. No decision can be drawn based only on $\theta_8$ or on $\theta_7 \cup \theta_8$ because of the full imprecision on their corresponding probabilities.


- Answer to Q2: Similar remarks and conclusions to those given in Example 3 also hold for Example 5, i.e. it is better to trust the most precise source for the most prudent decision-support hypothesis, and to decide to evacuate the building B if one has no prior information rather than using only the information based on the ANPR system.

Example 6: It can be easily verified that the same analysis, remarks and conclusions for Q1 and Q2 as for Example 5 also hold when considering the sources $m_1$ and $m_3$ corresponding to the following input Table 42.

[Table 42: entries not recoverable from the source.]
Table 42: Quantitative inputs of VBIED problem.


3.4 Impact of reliability of sources 


The reliability of sources (when known) can be easily taken into account using Shafer’s classical discounting technique [5], p. 252, which consists in multiplying the masses of focal elements by the reliability factor $\alpha$, and transferring all the remaining discounted mass to the full ignorance $\Theta$. When $\alpha < 1$, this very simple reliability discounting technique discounts all focal elements with the same factor $\alpha$, and it increases the non-specificity of the discounted sources since the mass committed to the full ignorance always increases. When $\alpha = 1$, no reliability discounting occurs (the bba is kept unchanged). Mathematically, Shafer’s discounting technique for taking into account the reliability factor $\alpha \in [0, 1]$ of a given source with a bba $m(.)$ and a frame $\Theta$ is defined by:

\[
\begin{cases}
m_\alpha(X) = \alpha \cdot m(X), & \text{for } X \neq \Theta \\
m_\alpha(\Theta) = \alpha \cdot m(\Theta) + (1 - \alpha)
\end{cases}
\tag{1}
\]

Example 7: Let’s consider again the inputs of Table 28. The impact of strong unreliability of the prior information $m_0$ has already been analyzed in Examples 3 and 4 by considering actually $\alpha_0 = 0.1$. Here we analyze the impact of the reliabilities of sources $m_0$, $m_1$, $m_2$ and $m_3$ according to the presentation done in section 2.4, and we choose the following set of reliability factors: $\alpha_0 = 0.9$, $\alpha_1 = 0.75$, $\alpha_2 = 0.75$ and $\alpha_3 = 0.25$. These values have been chosen approximately, but they reflect the fact that one has a very good confidence in the prior information, a good confidence in sources 1 and 2, and a low confidence in source 3. Let’s examine the change in the fusion result of the sources. Applying the reliability discounting technique [5], the new inputs corresponding to the bba’s discounted by (1) are given in Table 43.

[Table 43: entries not recoverable from the source.]
Table 43: Discounted inputs with $\alpha_0 = 0.9$, $\alpha_1 = 0.75$, $\alpha_2 = 0.75$ and $\alpha_3 = 0.25$.
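Shafer’s discounting (1) is straightforward to apply in code. The sketch below is only an illustration: the function name `discount` is hypothetical, $\theta_1, \ldots, \theta_8$ are encoded as integers, and the undiscounted masses $m_0(\theta_4 \cup \theta_8) = 1$, $m_2(\theta_4 \cup \theta_8) = 0.3$ and $m_2(\overline{\theta_4 \cup \theta_8}) = 0.7$ are assumptions inferred from the products listed in Example 9, not values read from Table 43.

```python
from typing import Dict, FrozenSet

Focal = FrozenSet[int]
FRAME: Focal = frozenset(range(1, 9))   # 1..8 stand for theta_1..theta_8
F1: Focal = frozenset({4, 8})           # theta_4 u theta_8
F3: Focal = FRAME - F1                  # complement of theta_4 u theta_8

def discount(m: Dict[Focal, float], alpha: float, frame: Focal) -> Dict[Focal, float]:
    """Shafer's reliability discounting: scale every mass by alpha and
    transfer the remaining mass (1 - alpha) to the full ignorance."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    out = {x: alpha * v for x, v in m.items() if x != frame}
    out[frame] = alpha * m.get(frame, 0.0) + (1.0 - alpha)
    return out

# Assumed undiscounted inputs (see the lead-in above).
m0 = {F1: 1.0}
m2 = {F1: 0.3, F3: 0.7}

m0_disc = discount(m0, 0.9, FRAME)    # alpha_0 = 0.9
m2_disc = discount(m2, 0.75, FRAME)   # alpha_2 = 0.75

if __name__ == "__main__":
    print({tuple(sorted(k)): round(v, 4) for k, v in m0_disc.items()})
    print({tuple(sorted(k)): round(v, 4) for k, v in m2_disc.items()})
```

Both discounted bba’s now carry mass on the full ignorance, which is the loss of specificity discussed in the text.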


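focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_6$ | 0.030967 | 0.030967
$\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7$ | 0.13119 | 0.11037
$\theta_8$ | 0.26543 | 0.26543
$\theta_4 \cup \theta_8$ | 0.37256 | 0.33686
$\theta_6 \cup \theta_8$ | 0.063483 | 0.068147
$I_t$ | 0.13637 | 0.18822

Table 44: Result of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 43.

[Table 45, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$: entries not recoverable from the source.]
Table 45: $DSmP_\epsilon$ of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 43.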


[Table 46, with columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$: entries not recoverable from the source.]
Table 46: BetP of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 43.

From the fusion result of Table 44, one gets when considering $\theta_6 \cup \theta_7 \cup \theta_8$:

• with PCR5: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.64012$
$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.35988, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.64012]$

• with PCR6: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.635456$
$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.364544, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.635456]$

and when considering $\theta_7 \cup \theta_8$

• with PCR5: $\Delta(\theta_7 \cup \theta_8) = 0.703603$
$P(\theta_7 \cup \theta_8) \in [0.26543, 0.969033]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0.030967, 0.73457]$

• with PCR6: $\Delta(\theta_7 \cup \theta_8) = 0.703603$
$P(\theta_7 \cup \theta_8) \in [0.26543, 0.969033]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0.030967, 0.73457]$

and when considering $\theta_8$ only

• with PCR5: $\Delta(\theta_8) = 0.572413$
$P(\theta_8) \in [0.26543, 0.837843]$
$P(\bar{\theta}_8) \in [0.162157, 0.73457]$

• with PCR6: $\Delta(\theta_8) = 0.593233$
$P(\theta_8) \in [0.26543, 0.858663]$
$P(\bar{\theta}_8) \in [0.141337, 0.73457]$

Using the DSmP transformation, one gets high probabilities in $\theta_6 \cup \theta_7 \cup \theta_8$, $\theta_7 \cup \theta_8$ and in $\theta_8$ because

$DSmP_{\epsilon,PCR5}(\theta_6 \cup \theta_7 \cup \theta_8)$: value not legible in the source
$DSmP_{\epsilon,PCR6}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.9837$
$DSmP_{\epsilon,PCR5}(\theta_7 \cup \theta_8)$: value not legible in the source
$DSmP_{\epsilon,PCR6}(\theta_7 \cup \theta_8) = 0.8302$
$DSmP_{\epsilon,PCR5}(\theta_8) = 0.8126$
$DSmP_{\epsilon,PCR6}(\theta_8) = 0.8266$

- Answer to Q1: Based on these results, one sees that the decision to take is to evacuate the building B since one gets a high probability in the decision-support hypotheses. The same conclusion is drawn when using max of Bel or max of Pl criteria. So there is little impact of reliability discounting on the final decision with respect to Example 1. It is however worth noting that introducing reliability discounting increases the non-specificity of information, since now $I_t$ is a new focal element of $m_{\alpha_0}$ and of $m_{\alpha_2}$, and in the final result we get the new focal element $\theta_6$ appearing with the PCR5 or PCR6 fusion rules. This $\theta_6$ focal element doesn’t exist in Example 1 when no reliability discounting is used. The decision to take in this case is to: Evacuate the building B.

To answer question Q2 for this Example 7, let’s compute the results of the fusions $m_0 \oplus m_2$ and $m_0 \oplus m_1 \oplus m_3$ using the inputs given in Table 43. The fusion results with the corresponding DSmP are given in Tables 47-50.


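focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_4 \cup \theta_8$ | 0.74842 | 0.74842
$\overline{\theta_4 \cup \theta_8}$ | 0.22658 | 0.22658
$I_t$ | 0.025 | 0.025

Table 47: Result of $m_0 \oplus m_2$.

[Table 48, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$: entries not recoverable from the source.]
Table 48: $DSmP_\epsilon$ of $m_0 \oplus m_2$.

The result of the BetP transformation is the same as with the DSmP transformation since there is no singleton as focal element of the resulting bba’s when using the PCR5 or PCR6 fusion rule.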
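For two sources the PCR5 rule (which then coincides with PCR6) is short enough to sketch directly. The code below is a hedged illustration, not the authors’ implementation: the function name `pcr5_two_sources` is hypothetical, and the discounted inputs are assumptions obtained by applying (1) with $\alpha_0 = 0.9$ and $\alpha_2 = 0.75$ to $m_0(\theta_4 \cup \theta_8) = 1$, $m_2(\theta_4 \cup \theta_8) = 0.3$ and $m_2(\overline{\theta_4 \cup \theta_8}) = 0.7$. Under these assumptions the masses 0.74842, 0.22658 and 0.025 of Table 47 are recovered.

```python
from typing import Dict, FrozenSet

Focal = FrozenSet[int]
FRAME: Focal = frozenset(range(1, 9))   # 1..8 stand for theta_1..theta_8
F1: Focal = frozenset({4, 8})           # theta_4 u theta_8
F3: Focal = FRAME - F1                  # its complement

def pcr5_two_sources(ma: Dict[Focal, float], mb: Dict[Focal, float]) -> Dict[Focal, float]:
    """Conjunctive rule with proportional redistribution of each partial
    conflict a*b back to the two focal elements that generated it."""
    out: Dict[Focal, float] = {}
    for x, a in ma.items():
        for y, b in mb.items():
            inter = x & y
            if inter:                       # compatible pair: mass goes to x n y
                out[inter] = out.get(inter, 0.0) + a * b
            elif a + b > 0.0:               # conflicting pair: split a*b in ratio a : b
                out[x] = out.get(x, 0.0) + a * a * b / (a + b)
                out[y] = out.get(y, 0.0) + a * b * b / (a + b)
    return out

# Assumed discounted inputs (see the lead-in above).
m0_disc = {F1: 0.9, FRAME: 0.1}
m2_disc = {F1: 0.225, F3: 0.525, FRAME: 0.25}

fused = pcr5_two_sources(m0_disc, m2_disc)

if __name__ == "__main__":
    for focal, mass in fused.items():
        print(sorted(focal), round(mass, 5))
```

The only conflicting pair is $(\theta_4 \cup \theta_8, \overline{\theta_4 \cup \theta_8})$, whose product $0.9 \times 0.525$ is shared between the two elements in proportion $0.9 : 0.525$.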


Based on the $m_0 \oplus m_2$ fusion result, one gets a large imprecision⁷ on $P(\theta_8)$ since $\Delta_{02}(\theta_8) = 0.77342$ with

$P(\theta_8) \in [0, 0.77342]$
$P(\bar{\theta}_8) \in [0.22658, 1]$

⁷ This imprecision is larger than in Example 1, which is normal because one has degraded the information of both the prior and the source $m_2$.

and one gets a total imprecision when considering either $\theta_7 \cup \theta_8$ or $\theta_6 \cup \theta_7 \cup \theta_8$ since

$P(\theta_7 \cup \theta_8) \in [0, 1]$, $P(\overline{\theta_7 \cup \theta_8}) \in [0, 1]$
$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0, 1]$, $P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 1]$

Based on max of Bel or max of Pl criteria, the decision using $m_0 \oplus m_2$ (i.e. with the discounted sources $m_0$ and ANPR system $m_2$) should be to NOT evacuate the building B when working with the decision-support hypothesis $\theta_8$. No clear decision can be taken when working with $\theta_6 \cup \theta_7 \cup \theta_8$ or $\theta_7 \cup \theta_8$. Ambiguity in decision-making occurs between $\theta_8 = (A, V, B)$ and $\theta_4 = (A, V, \bar{B})$ when using the DSmP or BetP transformations.


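Let’s now examine the result of the $m_0 \oplus m_1 \oplus m_3$ fusion given in Tables 49 and 50.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_8$ | 0.53086 | 0.53086
$\theta_4 \cup \theta_8$ | 0.36914 | 0.36914
$\theta_6 \cup \theta_8$ | 0.058984 | 0.058984
$I_t$ | 0.041016 | 0.041016

Table 49: Result of $m_0 \oplus m_1 \oplus m_3$.

[Table 50, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$: entries not recoverable from the source.]
Table 50: $DSmP_\epsilon$ of $m_0 \oplus m_1 \oplus m_3$.

[Table 51, with columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$: entries not recoverable from the source.]
Table 51: BetP of $m_0 \oplus m_1 \oplus m_3$.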


One sees clearly the impact of reliability discounting on the specificity of the information provided by the fusion of sources. Indeed, when using the discounting of sources (mainly because we introduce $I_t$ as focal element for $m_0$), one now gets 4 focal elements whereas we got only two focal elements when no discounting was used (see Table 9). Based on the $m_0 \oplus m_1 \oplus m_3$ fusion result, one gets $\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.410156$ with

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.589844, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.410156]$

and also $\Delta_{013}(\theta_7 \cup \theta_8) = 0.46914$ with

$P(\theta_7 \cup \theta_8) \in [0.53086, 1]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.46914]$

and $\Delta_{013}(\theta_8) = 0.46914$ with

$P(\theta_8) \in [0.53086, 1]$
$P(\bar{\theta}_8) \in [0, 0.46914]$

Based on max of Bel or max of Pl, the decision using $m_0 \oplus m_1 \oplus m_3$ should be to evacuate the building B. The same decision would be taken based on DSmP values. It is worth noting that the precision of the result obtained with $m_0 \oplus m_1 \oplus m_3$ is much better than with $m_0 \oplus m_2$, since $\Delta_{013}(\theta_8) < \Delta_{02}(\theta_8)$, $\Delta_{013}(\theta_7 \cup \theta_8) < \Delta_{02}(\theta_7 \cup \theta_8)$ and $\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) < \Delta_{02}(\theta_6 \cup \theta_7 \cup \theta_8)$. Moreover, it is easy to verify that the $m_0 \oplus m_1 \oplus m_3$ fusion system is more informative than the $m_0 \oplus m_2$ fusion system, because the Shannon entropy of DSmP of $m_0 \oplus m_2$ is much bigger than the Shannon entropy of DSmP of $m_0 \oplus m_1 \oplus m_3$.


- Answer to Q2: Since the information obtained by the fusion $m_0 \oplus m_2$ is less informative and less precise than the information obtained with the fusion $m_0 \oplus m_1 \oplus m_3$, it is better to choose and to trust the fusion system $m_0 \oplus m_1 \oplus m_3$ rather than $m_0 \oplus m_2$. Based on this choice, the final decision will be to evacuate the building B.


3.5 Impact of importance of sources 


The importance discounting technique has been proposed recently by the authors in [7] and consists in discounting the masses of focal elements by a factor $\beta \in [0, 1]$ and in transferring the remaining mass to the empty set, i.e.

\[
\begin{cases}
m_\beta(X) = \beta \cdot m(X), & \text{for } X \neq \emptyset \\
m_\beta(\emptyset) = \beta \cdot m(\emptyset) + (1 - \beta)
\end{cases}
\tag{2}
\]

It has been proved in [7] that such an importance discounting technique preserves the specificity of the information, and that Dempster’s rule of combination does not respond to this new discounting technique, which is especially useful and crucial in multicriteria decision-making support.


In the extreme case, the method proposed in [7] reinforces the highest mass of the focal element of the source having the biggest importance factor as soon as the other sources have their importance factors tending towards zero. This reinforcement may be a disputable behavior. To avoid such behavior, we propose here to use the same importance discounting technique, but with the fusion of the discounted sources done a bit differently, in three steps as follows:

• Step 1: Discount each source with its importance discounting factor according to (2).

• Step 2: Apply the PCR5 or PCR6 fusion rule with the unnormalized discounted bba’s, i.e. as if the discounted mass committed to the empty set for each source was zero.

• Step 3: Normalize the result so that the masses of the focal elements sum to one.

Let’s examine the impact of the importance of the sources in the fusion process for final decision-making through the next very simple illustrative example.


Example 8: To evaluate this we consider the same inputs as in Table 1 and we consider that source 1 (Analyst 1 with 10 years of experience) is much more important than source 3 (Analyst 2 with no experience). To reflect the difference between the importance of these sources we consider the following relative importance factors: $\beta_1 = 0.9$ and $\beta_3 = 0.5$. We also assume that source 0 (prior information) and source 2 (ANPR system) have the same maximal importance, i.e. $\beta_0 = \beta_2 = 1$, i.e. $\beta = (\beta_0, \beta_1, \beta_2, \beta_3) = (1, 0.9, 1, 0.5)$. These values have been chosen approximately, but they do reflect the fact that sources $m_0$ and $m_2$ have the same importance in the fusion process, and that sources $m_1$ and $m_3$ may have less importance in the fusion process, taking into account the fact that $m_3$ is considered as less important than $m_1$. Let’s examine the change in the fusion result of the sources in this example with respect to what we get in Example 1.

In applying the importance discounting technique [7] with the aforementioned fusion approach, the new inputs corresponding to the (unnormalized) bba’s discounted by (2) are given in Table 52

[Table 52: entries not recoverable from the source.]
Table 52: Discounted inputs with $\beta_0 = 1$, $\beta_1 = 0.9$, $\beta_2 = 1$ and $\beta_3 = 0.5$.

and the fusion result is given in Table 53.

focal element | $m_{PCR5}(.)$ | $m_{PCR6}(.)$
$\theta_8$ | 0.24375 | 0.24375
$\theta_4 \cup \theta_8$ | 0.36788 | 0.33034
$\theta_6 \cup \theta_8$ | 0.10552 | 0.14132
$\overline{\theta_4 \cup \theta_8}$ | 0.21814 | 0.19186
$I_t$ | 0.064701 | 0.092734

Table 53: Result of $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_2} \oplus m_{\beta_3}$.

As we can see, the importance discounting doesn’t degrade the specificity of sources since no mass is committed to partial ignorances, and it also doesn’t increase the number of focal elements of the resulting bba, contrary to the reliability discounting approach. Indeed in Table 53 one gets only 5 focal elements, whereas one gets 6 focal elements with reliability discounting as shown in Table 44 of Example 7. The corresponding DSmP and BetP values of the bba’s given in Table 53 are summarized in Tables 54 and 55.
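The three-step procedure (importance discounting, unnormalized PCR6, normalization) can be sketched as follows. This is an illustration under stated assumptions rather than the authors’ code: the function names are hypothetical, and the undiscounted inputs $m_0(\theta_4 \cup \theta_8) = 1$; $m_1(\theta_6 \cup \theta_8) = 0.75$, $m_1(I_t) = 0.25$; $m_2(\theta_4 \cup \theta_8) = 0.3$, $m_2(\overline{\theta_4 \cup \theta_8}) = 0.7$; $m_3(\theta_6 \cup \theta_8) = 0.25$, $m_3(I_t) = 0.75$ are assumed for Table 1, which is not reproduced in this section. With these inputs the sketch recovers the PCR6 column of Table 53.

```python
from itertools import product
from typing import Dict, FrozenSet, List

Focal = FrozenSet[int]
FRAME: Focal = frozenset(range(1, 9))    # 1..8 stand for theta_1..theta_8
F1: Focal = frozenset({4, 8})            # theta_4 u theta_8
F2: Focal = frozenset({6, 8})            # theta_6 u theta_8
F3: Focal = FRAME - F1                   # complement of theta_4 u theta_8

def importance_discount(m: Dict[Focal, float], beta: float) -> Dict[Focal, float]:
    """Step 1: scale focal masses by beta; the mass (1 - beta) sent to the
    empty set is simply dropped, as Step 2 treats it as zero."""
    return {x: beta * v for x, v in m.items()}

def pcr6(sources: List[Dict[Focal, float]]) -> Dict[Focal, float]:
    """Step 2: PCR6 on (possibly unnormalized) bba's. Each conflicting product
    is shared among the elements involved, proportionally to their masses."""
    out: Dict[Focal, float] = {}
    for combo in product(*(s.items() for s in sources)):
        masses = [v for _, v in combo]
        prod = 1.0
        for v in masses:
            prod *= v
        if prod == 0.0:
            continue
        inter = frozenset.intersection(*(x for x, _ in combo))
        if inter:
            out[inter] = out.get(inter, 0.0) + prod
        else:
            total = sum(masses)
            for x, v in combo:
                out[x] = out.get(x, 0.0) + prod * v / total
    return out

def normalize(m: Dict[Focal, float]) -> Dict[Focal, float]:
    """Step 3: rescale so that the masses of focal elements sum to one."""
    s = sum(m.values())
    return {x: v / s for x, v in m.items()}

# Assumed inputs of Table 1 and importance factors of Example 8.
inputs = [{F1: 1.0}, {F2: 0.75, FRAME: 0.25}, {F1: 0.3, F3: 0.7}, {F2: 0.25, FRAME: 0.75}]
betas = [1.0, 0.9, 1.0, 0.5]

result = normalize(pcr6([importance_discount(m, b) for m, b in zip(inputs, betas)]))

if __name__ == "__main__":
    for focal, mass in sorted(result.items(), key=lambda kv: -kv[1]):
        print(sorted(focal), round(mass, 5))
```

Note that no new focal element is created by the discounting itself: the result contains only $\theta_8$ and the focal elements already present in the inputs.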


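[Table 54, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$: entries not recoverable from the source.]

[Table 55, with columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$: entries not recoverable from the source.]
Table 55: BetP of $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_2} \oplus m_{\beta_3}$.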


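From the fusion result of Table 53, one gets when considering $\theta_6 \cup \theta_7 \cup \theta_8$

• with PCR5: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.65073$
$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.34927, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.65073]$

• with PCR6: $\Delta(\theta_6 \cup \theta_7 \cup \theta_8) = 0.61493$
$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.38507, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.61493]$

and when considering $\theta_7 \cup \theta_8$

• with PCR5: $\Delta(\theta_7 \cup \theta_8) = 0.75625$
$P(\theta_7 \cup \theta_8) \in [0.24375, 1]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.75625]$

• with PCR6: $\Delta(\theta_7 \cup \theta_8) = 0.75625$
$P(\theta_7 \cup \theta_8) \in [0.24375, 1]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.75625]$

and when considering $\theta_8$ only

• with PCR5: $\Delta(\theta_8) = 0.53811$
$P(\theta_8) \in [0.24375, 0.78186]$
$P(\bar{\theta}_8) \in [0.21814, 0.75625]$

• with PCR6: $\Delta(\theta_8) = 0.564393$
$P(\theta_8) \in [0.24375, 0.80814]$
$P(\bar{\theta}_8) \in [0.19186, 0.75625]$

Using the DSmP transformation, one gets a high probability in the decision-support hypotheses because

$DSmP_{\epsilon,PCR5}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.8517$
$DSmP_{\epsilon,PCR6}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.8688$
$DSmP_{\epsilon,PCR5}(\theta_7 \cup \theta_8) = 0.8147$
$DSmP_{\epsilon,PCR6}(\theta_7 \cup \theta_8) = 0.8359$
$DSmP_{\epsilon,PCR5}(\theta_8) = 0.7781$
$DSmP_{\epsilon,PCR6}(\theta_8) = 0.8036$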




- Answer to Q1: Based on these results, one sees that the decision to take, based either on max of Bel or max of Pl on $\theta_6 \cup \theta_7 \cup \theta_8$, on $\theta_7 \cup \theta_8$ or on $\theta_8$ only, or also based on DSmP, is to evacuate the building B.


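To answer question Q2 for this Example 8, let’s compute the results of the fusions $m_{\beta_0} \oplus m_{\beta_2}$ and $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_3}$ using the inputs given in Table 52. Because one has considered $\beta_0 = \beta_2 = 1$, one does not actually discount sources $m_0$ and $m_2$, and therefore the $m_0 \oplus m_2$ fusion results are already given in Tables 5 and 6 of Example 1. Therefore, based on max of Bel or max of Pl criteria on $\theta_8$, the decision using $m_0 \oplus m_2$ is to NOT evacuate the building B since $P(\theta_8) \in [0, 0.71176]$ and $P(\bar{\theta}_8) \in [0.28824, 1]$ and $\Delta_{02}(\theta_8) = 0.71176$. The same decision would be taken based on DSmP values with the $m_0 \oplus m_2$ fusion sub-system.

Let’s now compute the fusion $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_3}$ with the importance discounted sources $m_{\beta_0 = 1} = m_0$, $m_{\beta_1}$ and $m_{\beta_3}$. The fusion results are given in Tables 56, 57 and 58.

[Table 56, with columns $m_{PCR5}(.)$ and $m_{PCR6}(.)$: entries not recoverable from the source.]

[Table 57, with columns $DSmP_{\epsilon,PCR5}(.)$ and $DSmP_{\epsilon,PCR6}(.)$: entries not recoverable from the source.]

[Table 58, with columns $BetP_{PCR5}(.)$ and $BetP_{PCR6}(.)$: entries not recoverable from the source.]
Table 58: BetP of $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_3}$.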


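Based on the $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_3}$ fusion result, one gets the same result with PCR5 and PCR6 in this case, and one gets $\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.1875$ with

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.8125, 1]$
$P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.1875]$

and $\Delta_{013}(\theta_7 \cup \theta_8) = 0.1875$ with

$P(\theta_7 \cup \theta_8) \in [0.8125, 1]$
$P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.1875]$

and also $\Delta_{013}(\theta_8) = 0.1875$ with

$P(\theta_8) \in [0.8125, 1]$
$P(\bar{\theta}_8) \in [0, 0.1875]$

Based on max of Bel or max of Pl, the decision taken using $\theta_6 \cup \theta_7 \cup \theta_8$, $\theta_7 \cup \theta_8$ or $\theta_8$ for the $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_3}$ fusion sub-system should be to evacuate the building B. The same decision must be taken based on DSmP or BetP values. It is worth noting that the precision⁸ of the result obtained with the subsystem $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_3}$ is much better than with the subsystem $m_{\beta_0 = 1} \oplus m_{\beta_2 = 1}$, since $(\Delta_{013}(\theta_8) = 0.1875) < (\Delta_{02}(\theta_8) = 0.71176)$, and also $(\Delta_{013}(\theta_7 \cup \theta_8) = 0.1875) < (\Delta_{02}(\theta_7 \cup \theta_8) = 1)$, and $(\Delta_{013}(\theta_6 \cup \theta_7 \cup \theta_8) = 0.1875) < (\Delta_{02}(\theta_6 \cup \theta_7 \cup \theta_8) = 1)$.

⁸ See Example 1 for the numerical results of $m_{\beta_0 = 1} \oplus m_{\beta_2 = 1}$.

- Answer to Q2: The analysis of both fusion sub-systems $m_{\beta_0 = 1} \oplus m_{\beta_2 = 1}$ and $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_3}$ shows that the $m_{\beta_0} \oplus m_{\beta_1} \oplus m_{\beta_3}$ subsystem must be chosen because it provides the most precise results, and therefore the decision will be to evacuate the building B whatever decision-support hypothesis is chosen: $\theta_6 \cup \theta_7 \cup \theta_8$, $\theta_7 \cup \theta_8$ or $\theta_8$.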


3.6 Using imprecise bba’s 


Let’s examine the fusion result when dealing directly with imprecise bba’s. We just consider here a simple imprecise example which uses the inputs of both Examples 1 and 2 to generate imprecise bba inputs.

Example 9: We consider the imprecise bba’s according to input Table 59.

focal element | $m_0(.)$ | $m_1(.)$ | $m_2(.)$ | $m_3(.)$
$f_1 = \theta_4 \cup \theta_8$ | 1 | 0 | 0.3 | 0
$f_2 = \theta_6 \cup \theta_8$ | 0 | [0.75, 0.9] | 0 | [0.10, 0.25]
$f_3 = \overline{\theta_4 \cup \theta_8}$ | 0 | 0 | 0.7 | 0
$I_t$ | 0 | [0.10, 0.25] | 0 | [0.75, 0.9]

Table 59: Imprecise quantitative inputs for VBIED problem.


Applying the conjunctive rule, we have $1 \times 2 \times 2 \times 2 = 8$ products to compute, which are listed below:

• Product $\pi_1 = 1 \boxdot [0.75, 0.90] \boxdot 0.3 \boxdot [0.10, 0.25]$. Using operators on sets defined in [6], Vol. 1, Chap. 6, one gets $\pi_1 = [0.75, 0.90] \boxdot [0.03, 0.075] = [0.0225, 0.0675]$ which is committed to $f_1 \cap f_2 = \theta_8$.

• Product $\pi_2 = 1 \boxdot [0.75, 0.90] \boxdot 0.3 \boxdot [0.75, 0.90]$ is equal to $[0.75, 0.90] \boxdot [0.225, 0.27] = [0.16875, 0.243]$ which is also committed to $f_1 \cap f_2 = \theta_8$.

• Product $\pi_3 = 1 \boxdot [0.75, 0.90] \boxdot 0.7 \boxdot [0.10, 0.25] = [0.0525, 0.1575]$ corresponds to the imprecise mass of $f_1 \cap f_2 \cap f_3 = \emptyset$ which will be redistributed back to $f_1$, $f_2$ and $f_3$ according to PCR6.

• Product $\pi_4 = 1 \boxdot [0.75, 0.90] \boxdot 0.7 \boxdot [0.75, 0.90] = [0.39375, 0.567]$ corresponds to the imprecise mass of $f_1 \cap f_2 \cap f_3 \cap I_t = \emptyset$ which will be redistributed back to $f_1$, $f_2$, $f_3$ and $I_t$ according to PCR6.

• Product $\pi_5 = 1 \boxdot [0.10, 0.25] \boxdot 0.3 \boxdot [0.10, 0.25] = [0.003, 0.01875]$ is committed to $f_1 \cap f_2 = \theta_8$.

• Product $\pi_6 = 1 \boxdot [0.10, 0.25] \boxdot 0.3 \boxdot [0.75, 0.90] = [0.0225, 0.0675]$ is committed to $f_1$.

• Product $\pi_7 = 1 \boxdot [0.10, 0.25] \boxdot 0.7 \boxdot [0.10, 0.25] = [0.007, 0.04375]$ corresponds to the imprecise mass of $f_1 \cap I_t \cap f_3 \cap f_2 = \emptyset$ which will be redistributed back to $f_1$, $f_2$, $f_3$ and $I_t$ according to PCR6.

• Product $\pi_8 = 1 \boxdot [0.10, 0.25] \boxdot 0.7 \boxdot [0.75, 0.90] = [0.0525, 0.1575]$ corresponds to the imprecise mass of $f_1 \cap I_t \cap f_3 \cap I_t = \emptyset$ which will be redistributed back to $f_1$, $f_3$ and $I_t$ according to PCR6.
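The eight products can be reproduced with elementary interval arithmetic. The sketch below is illustrative only: it represents every imprecise mass as a closed interval `(lo, hi)` of non-negative numbers (so that multiplication acts bound by bound), and the helper names `imul` and `pt` are hypothetical.

```python
from functools import reduce
from itertools import product
from typing import List, Tuple

Interval = Tuple[float, float]

def pt(x: float) -> Interval:
    """A precise value seen as a degenerate interval."""
    return (x, x)

def imul(a: Interval, b: Interval) -> Interval:
    """Product of two intervals of non-negative numbers (bound by bound)."""
    return (a[0] * b[0], a[1] * b[1])

# Non-zero masses of each source, in the order used for pi_1 .. pi_8.
m0: List[Interval] = [pt(1.0)]
m1: List[Interval] = [(0.75, 0.90), (0.10, 0.25)]
m2: List[Interval] = [pt(0.3), pt(0.7)]
m3: List[Interval] = [(0.10, 0.25), (0.75, 0.90)]

# itertools.product varies the last source fastest, which matches the
# numbering pi_1, ..., pi_8 of the list above.
pis: List[Interval] = [reduce(imul, combo) for combo in product(m0, m1, m2, m3)]

if __name__ == "__main__":
    for k, (lo, hi) in enumerate(pis, start=1):
        print(f"pi_{k} = [{lo:.5f}, {hi:.5f}]")
```

The bound-by-bound product is valid here because all masses are non-negative; a general interval product would need to consider all four combinations of bounds.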











We now redistribute the imprecise masses $\pi_3$, $\pi_4$, $\pi_7$ and $\pi_8$ associated with the empty set using the PCR6 principle. Let’s compute the proportions of $\pi_3$, $\pi_4$, $\pi_7$ and $\pi_8$ committed to each focal element involved in the conflict they are associated with.


• The product $\pi_3 = [0.0525, 0.1575]$ is distributed to $f_1$, $f_2$ and $f_3$ according to PCR6 as follows

\[
\frac{x_{f_1,\pi_3}}{1} = \frac{y_{f_2,\pi_3}}{[0.75, 0.90] \boxplus [0.10, 0.25]} = \frac{z_{f_3,\pi_3}}{0.7}
= \frac{\pi_3}{1 \boxplus [0.75, 0.90] \boxplus [0.10, 0.25] \boxplus 0.7}
\]
\[
= \frac{[0.0525, 0.1575]}{1.7 \boxplus [0.85, 1.15]} = \frac{[0.0525, 0.1575]}{[2.55, 2.85]}
= \left[\frac{0.0525}{2.85}, \frac{0.1575}{2.55}\right] = [0.018421, 0.061765]
\]

whence

$x_{f_1,\pi_3} = 1 \boxdot [0.018421, 0.061765] = [0.018421, 0.061765]$
$y_{f_2,\pi_3} = [0.85, 1.15] \boxdot [0.018421, 0.061765] = [0.015658, 0.071029]$
$z_{f_3,\pi_3} = 0.7 \boxdot [0.018421, 0.061765] = [0.012895, 0.043236]$
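The redistribution of $\pi_3$ can be checked with the same interval operators. This is a minimal sketch, assuming closed intervals of non-negative numbers; the helper names `iadd`, `imul` and `idiv` are illustrative stand-ins for the addition, multiplication and division of imprecise values used in the text.

```python
from typing import Tuple

Interval = Tuple[float, float]

def iadd(a: Interval, b: Interval) -> Interval:
    """Interval addition: bounds add up."""
    return (a[0] + b[0], a[1] + b[1])

def imul(a: Interval, b: Interval) -> Interval:
    """Interval product for non-negative intervals."""
    return (a[0] * b[0], a[1] * b[1])

def idiv(a: Interval, b: Interval) -> Interval:
    """Interval quotient for positive b: inf/sup and sup/inf."""
    return (a[0] / b[1], a[1] / b[0])

pi3: Interval = (0.0525, 0.1575)

# Masses involved in the conflict pi_3: m0(f1) = 1, m1(f2) = [0.75, 0.90],
# m2(f3) = 0.7 and m3(f2) = [0.10, 0.25]; f2 appears twice, so its shares add up.
w_f1: Interval = (1.0, 1.0)
w_f2: Interval = iadd((0.75, 0.90), (0.10, 0.25))   # [0.85, 1.15]
w_f3: Interval = (0.7, 0.7)

denominator = iadd(iadd(w_f1, w_f2), w_f3)          # [2.55, 2.85]
ratio = idiv(pi3, denominator)                      # [0.018421, 0.061765]

x_f1 = imul(w_f1, ratio)
y_f2 = imul(w_f2, ratio)
z_f3 = imul(w_f3, ratio)

if __name__ == "__main__":
    for name, (lo, hi) in [("ratio", ratio), ("x_f1", x_f1), ("y_f2", y_f2), ("z_f3", z_f3)]:
        print(f"{name} = [{lo:.6f}, {hi:.6f}]")
```

The products $\pi_4$, $\pi_7$ and $\pi_8$ can be handled the same way by changing the weights and the conflicting mass.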


• The product $\pi_4 = [0.39375, 0.567]$ is distributed to $f_1$, $f_2$, $f_3$ and $I_t$ according to PCR6 as follows

\[
\frac{x_{f_1,\pi_4}}{1} = \frac{y_{f_2,\pi_4}}{[0.75, 0.90]} = \frac{z_{f_3,\pi_4}}{0.7} = \frac{w_{I_t,\pi_4}}{[0.75, 0.90]}
= \frac{\pi_4}{1 \boxplus [0.75, 0.90] \boxplus 0.7 \boxplus [0.75, 0.90]}
\]
\[
= \frac{[0.39375, 0.567]}{[3.2, 3.5]} = [0.1125, 0.177188]
\]

whence

$x_{f_1,\pi_4} = 1 \boxdot [0.1125, 0.177188] = [0.1125, 0.177188]$
$y_{f_2,\pi_4} = [0.75, 0.90] \boxdot [0.1125, 0.177188] = [0.084375, 0.159469]$
$z_{f_3,\pi_4} = 0.7 \boxdot [0.1125, 0.177188] = [0.07875, 0.124031]$
$w_{I_t,\pi_4} = [0.75, 0.90] \boxdot [0.1125, 0.177188] = [0.084375, 0.159469]$

• The product $\pi_7 = [0.007, 0.04375]$ is distributed to $f_1$, $f_2$, $f_3$ and $I_t$ according to PCR6 as follows

\[
\frac{x_{f_1,\pi_7}}{1} = \frac{y_{f_2,\pi_7}}{[0.10, 0.25]} = \frac{z_{f_3,\pi_7}}{0.7} = \frac{w_{I_t,\pi_7}}{[0.10, 0.25]}
= \frac{\pi_7}{1 \boxplus [0.10, 0.25] \boxplus 0.7 \boxplus [0.10, 0.25]}
\]
\[
= \frac{[0.007, 0.04375]}{[1.9, 2.2]} = [0.003182, 0.023026]
\]

whence

$x_{f_1,\pi_7} = 1 \boxdot [0.003182, 0.023026] = [0.003182, 0.023026]$
$y_{f_2,\pi_7} = [0.10, 0.25] \boxdot [0.003182, 0.023026] = [0.000318, 0.005757]$
$z_{f_3,\pi_7} = 0.7 \boxdot [0.003182, 0.023026] = [0.002227, 0.016118]$
$w_{I_t,\pi_7} = [0.10, 0.25] \boxdot [0.003182, 0.023026] = [0.000318, 0.005757]$








• The product $\pi_8 = [0.0525, 0.1575]$ is distributed to $f_1$, $f_3$ and $I_t$ according to PCR6 as follows

\[
\frac{x_{f_1,\pi_8}}{1} = \frac{z_{f_3,\pi_8}}{0.7} = \frac{w_{I_t,\pi_8}}{[0.10, 0.25] \boxplus [0.75, 0.90]}
= \frac{\pi_8}{1 \boxplus 0.7 \boxplus [0.10, 0.25] \boxplus [0.75, 0.90]}
\]
\[
= \frac{[0.0525, 0.1575]}{[2.55, 2.85]} = [0.018421, 0.061765]
\]

whence

$x_{f_1,\pi_8} = 1 \boxdot [0.018421, 0.061765] = [0.018421, 0.061765]$
$z_{f_3,\pi_8} = 0.7 \boxdot [0.018421, 0.061765] = [0.012895, 0.043235]$
$w_{I_t,\pi_8} = [0.85, 1.15] \boxdot [0.018421, 0.061765] = [0.015658, 0.071029]$


Summing the results, we get for mo m1 ® M2 ® M3 
with PCR6 the following imprecise mpcre bba: 





























mpcre(9g) = Tı 
0.19425, 0.32925] 


T2 T5 

























































































mpcore(fi) = T fi,nas ON, rg OV fy rz OV fy rg 
= (0.152524, 0.323743 

mpcre(fa2) = Vfo.73 OU fo,ra OU fo, a7 
= (0.100351, 0.236255 

mpore(f3) = 2fs,3 H 2 fs,n4 HZ fs,07 E Zf, rs 
= (0.106767, 0.226620 

mporo(lt) = Wire H W ,r7 




















WI, rg 


0.100351, 0.236255 
Here the $\boxplus$, $\boxdot$ and $\oslash$ operators (i.e. the addition, multiplication and division of imprecise values), and other operators on sets, were defined in [6], Vol. 1, pp. 127-130, by

$S_1 \boxplus S_2 = \{x \mid x = s_1 + s_2,\ s_1 \in S_1,\ s_2 \in S_2\}$

$S_1 \boxdot S_2 = \{x \mid x = s_1 \cdot s_2,\ s_1 \in S_1,\ s_2 \in S_2\}$

$S_1 \oslash S_2 = \{x \mid x = s_1 / s_2,\ s_1 \in S_1,\ s_2 \in S_2\}$

with

$\inf(S_1 \boxplus S_2) = \inf(S_1) + \inf(S_2)$ and $\sup(S_1 \boxplus S_2) = \sup(S_1) + \sup(S_2)$

$\inf(S_1 \boxdot S_2) = \inf(S_1) \cdot \inf(S_2)$ and $\sup(S_1 \boxdot S_2) = \sup(S_1) \cdot \sup(S_2)$

$\inf(S_1 \oslash S_2) = \inf(S_1) / \sup(S_2)$ and $\sup(S_1 \oslash S_2) = \sup(S_1) / \inf(S_2)$


We have summarized the results in Table 60. The left column of this table corresponds to the imprecise values of $m_{PCR6}$ based on exact calculus with operators on sets (i.e. the exact calculus with imprecision). The right column of this table ($m^{approx}_{PCR6}$) corresponds to the result obtained with non-exact calculus based on the results given in Examples 1 and 2 in the right columns of Tables 2 and 12. This is what we call approximate results since they are not based on exact calculus with operators on sets. One sees important differences between the results in the left and right columns, which can have an impact on the final decision process when working with imprecise bba’s. We therefore suggest always using exact calculus (more complicated) instead of approximate calculus (easier) in order to get the real imprecision on the bba’s values. The same approach can be used for combining imprecise bba’s with PCR5 (not reported in this paper).


focal element | $m_{PCR6}(.)$ | $m^{approx}_{PCR6}(.)$
$\theta_8$ | [0.194250, 0.329250] | [0.24375, 0.27300]
$\theta_4 \cup \theta_8$ | [0.152524, 0.323743] | [0.23935, 0.29641]
$\theta_6 \cup \theta_8$ | [0.100351, 0.236255] | [0.14587, 0.16950]
$\overline{\theta_4 \cup \theta_8}$ | [0.106767, 0.226620] | [0.14865, 0.16811]
$I_t$ | [0.100351, 0.236255] | [0.14587, 0.16950]

Table 60: Results of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 59.


Based on the results in the left column of Table 60, one can easily compute the imprecise Bel and Pl values of the decision-support hypotheses, which are

$Bel(\theta_6 \cup \theta_7 \cup \theta_8) = [0.294601, 0.565505]$
$Pl(\theta_6 \cup \theta_7 \cup \theta_8) = [0.654243, 1.352123] \equiv [0.654243, 1]$

$Bel(\theta_7 \cup \theta_8) = [0.194250, 0.329250]$
$Pl(\theta_7 \cup \theta_8) = [0.654243, 1.352123] \equiv [0.654243, 1]$

$Bel(\theta_8) = [0.194250, 0.329250]$
$Pl(\theta_8) = [0.547476, 1.125502] \equiv [0.547476, 1]$

Therefore, one gets the following imprecision ranges for probabilities

$P(\theta_6 \cup \theta_7 \cup \theta_8) \in [0.294601, 1]$ and $P(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [0, 0.705399]$

$P(\theta_7 \cup \theta_8) \in [0.194250, 1]$ and $P(\overline{\theta_7 \cup \theta_8}) \in [0, 0.805750]$

$P(\theta_8) \in [0.194250, 1]$ and $P(\bar{\theta}_8) \in [0, 0.805750]$

Based on the max of Bel or the max of Pl criteria, one sees that the decision will be to evacuate the building B whatever the decision-support hypothesis we prefer: $\theta_6 \cup \theta_7 \cup \theta_8$, $\theta_7 \cup \theta_8$ or $\theta_8$.
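
The imprecise Bel and Pl values can be checked with a few lines of interval arithmetic. The following is only a minimal sketch: it represents each imprecise mass as a `(lower, upper)` tuple, encodes the focal elements of the left column of Table 60 as sets of singleton indices (an encoding chosen here for illustration, not taken from the paper), and the helper names `iadd`, `bel`, `pl` and `clip01` are hypothetical.

```python
# Sketch: imprecise Bel/Pl from interval-valued masses (left column of Table 60).
# Intervals are (lower, upper) tuples; focal elements are frozensets of indices 1..8.

FRAME = frozenset(range(1, 9))

m_pcr6 = {
    frozenset({8}): (0.194250, 0.329250),
    frozenset({4, 8}): (0.152524, 0.323743),
    frozenset({6, 8}): (0.100351, 0.236255),
    FRAME - {4, 8}: (0.106767, 0.226620),   # complement of theta4 U theta8
    FRAME: (0.100351, 0.236255),            # total ignorance I_t
}

def iadd(a, b):
    """Interval addition: bounds are added component-wise."""
    return (a[0] + b[0], a[1] + b[1])

def bel(m, hypothesis):
    """Sum the masses of all focal elements included in the hypothesis."""
    total = (0.0, 0.0)
    for focal, mass in m.items():
        if focal <= hypothesis:
            total = iadd(total, mass)
    return total

def pl(m, hypothesis):
    """Sum the masses of all focal elements intersecting the hypothesis."""
    total = (0.0, 0.0)
    for focal, mass in m.items():
        if focal & hypothesis:
            total = iadd(total, mass)
    return total

def clip01(a):
    """A probability bound cannot exceed 1, so the upper bound is truncated."""
    return (max(0.0, a[0]), min(1.0, a[1]))

for hyp in (frozenset({6, 7, 8}), frozenset({7, 8}), frozenset({8})):
    print(sorted(hyp), bel(m_pcr6, hyp), clip01(pl(m_pcr6, hyp)))
```

The truncation in `clip01` mirrors the way the upper bounds of Pl larger than 1 are set to 1 in the text.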


Let’s compute now the imprecise DSmP values for $\epsilon = 0.001$. The focal element $f_1 = \theta_4 \cup \theta_8$ is redistributed back to $\theta_4$ and $\theta_8$ directly proportionally to their corresponding masses and cardinalities

$$\frac{x_{\theta_4}}{0 \boxplus 0.001} = \frac{y_{\theta_8}}{[0.194250, 0.329250] \boxplus 0.001} = \frac{m_{PCR6}(\theta_4 \cup \theta_8)}{0.002 \boxplus [0.194250, 0.329250]} = \frac{[0.152524, 0.323743]}{[0.196250, 0.331250]} = \left[\frac{0.152524}{0.331250}, \frac{0.323743}{0.196250}\right] = [0.46045, 1.64965]$$

whence

$$x_{\theta_4} = 0.001 \boxdot [0.46045, 1.64965] = [0.000460, 0.001650]$$

$$y_{\theta_8} = [0.195250, 0.330250] \boxdot [0.46045, 1.64965] = [0.089903, 0.544797]$$


The focal element $f_2 = \theta_6 \cup \theta_8$ is redistributed back to $\theta_6$ and $\theta_8$ directly proportionally to their corresponding masses and cardinalities

$$\frac{z_{\theta_6}}{0 \boxplus 0.001} = \frac{y'_{\theta_8}}{[0.194250, 0.329250] \boxplus 0.001} = \frac{m_{PCR6}(\theta_6 \cup \theta_8)}{0.002 \boxplus [0.194250, 0.329250]} = \frac{[0.100351, 0.236255]}{[0.196250, 0.331250]} = [0.302943, 1.20385]$$

whence

$$z_{\theta_6} = 0.001 \boxdot [0.302943, 1.20385] = [0.000303, 0.001204]$$

$$y'_{\theta_8} = [0.195250, 0.330250] \boxdot [0.302943, 1.20385] = [0.0591496, 0.3975714]$$


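The focal element $f_3 = \overline{\theta_4 \cup \theta_8}$, which is also equal to $\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_5 \cup \theta_6 \cup \theta_7$, is redistributed back to $\theta_1$, $\theta_2$, $\theta_3$, $\theta_5$, $\theta_6$ and $\theta_7$ directly proportionally to their corresponding masses and cardinalities

$$\frac{w_{\theta_1}}{0 \boxplus 0.001} = \frac{w_{\theta_2}}{0 \boxplus 0.001} = \frac{w_{\theta_3}}{0 \boxplus 0.001} = \frac{w_{\theta_5}}{0 \boxplus 0.001} = \frac{w_{\theta_6}}{0 \boxplus 0.001} = \frac{w_{\theta_7}}{0 \boxplus 0.001} = \frac{m_{PCR6}(\overline{\theta_4 \cup \theta_8})}{0.006} = \frac{[0.106767, 0.226620]}{0.006} = [17.7945, 37.770]$$

Since all are equal, we get

$$w_{\theta_1} = w_{\theta_2} = w_{\theta_3} = w_{\theta_5} = w_{\theta_6} = w_{\theta_7} = 0.001 \boxdot [17.7945, 37.770] = [0.0177945, 0.03777]$$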


The total ignorance $I_t = \theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4 \cup \theta_5 \cup \theta_6 \cup \theta_7 \cup \theta_8$ is redistributed back to all eight elements of the frame $\Theta$ directly proportionally to their corresponding masses and cardinalities

$$\frac{v_{\theta_1}}{0 \boxplus 0.001} = \frac{v_{\theta_2}}{0 \boxplus 0.001} = \ldots = \frac{v_{\theta_7}}{0 \boxplus 0.001} = \frac{v_{\theta_8}}{[0.194250, 0.329250] \boxplus 0.001} = \frac{m_{PCR6}(I_t)}{[0.194250, 0.329250] \boxplus 0.008} = \frac{[0.100351, 0.236255]}{[0.202250, 0.337250]} = [0.297557, 1.16813]$$

whence

$$v_{\theta_1} = v_{\theta_2} = v_{\theta_3} = v_{\theta_4} = v_{\theta_5} = v_{\theta_6} = v_{\theta_7} = 0.001 \boxdot [0.297557, 1.16813] = [0.000298, 0.001168]$$

$$v_{\theta_8} = [0.195250, 0.330250] \boxdot [0.297557, 1.16813] = [0.058098, 0.385775]$$








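The imprecise DSmP probabilities are computed by

$DSmP_{\epsilon,PCR6}(\theta_1) = w_{\theta_1} \boxplus v_{\theta_1}$
$DSmP_{\epsilon,PCR6}(\theta_2) = w_{\theta_2} \boxplus v_{\theta_2}$
$DSmP_{\epsilon,PCR6}(\theta_3) = w_{\theta_3} \boxplus v_{\theta_3}$
$DSmP_{\epsilon,PCR6}(\theta_4) = x_{\theta_4} \boxplus v_{\theta_4}$
$DSmP_{\epsilon,PCR6}(\theta_5) = w_{\theta_5} \boxplus v_{\theta_5}$
$DSmP_{\epsilon,PCR6}(\theta_6) = z_{\theta_6} \boxplus w_{\theta_6} \boxplus v_{\theta_6}$
$DSmP_{\epsilon,PCR6}(\theta_7) = w_{\theta_7} \boxplus v_{\theta_7}$
$DSmP_{\epsilon,PCR6}(\theta_8) = y_{\theta_8} \boxplus y'_{\theta_8} \boxplus v_{\theta_8}$

which are summarized⁹ in Table 61 below.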


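singleton | $DSmP_{\epsilon,PCR6}(.)$
$\theta_1$ | [0.0181, 0.0389]
$\theta_2$ | [0.0181, 0.0389]
$\theta_3$ | [0.0181, 0.0389]
$\theta_4$ | [0.0008, 0.0028]
$\theta_5$ | [0.0181, 0.0389]
$\theta_6$ | [0.0184, 0.0402]
$\theta_7$ | [0.0181, 0.0389]
$\theta_8$ | [0.2072, 1]

Table 61: Imprecise $DSmP_{\epsilon}$ of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 59.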
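
The proportional redistribution used for the imprecise DSmP can be sketched in code. The snippet below is an illustration under stated assumptions, not the authors' implementation: intervals are `(lower, upper)` tuples with non-negative bounds, the helper names (`iadd`, `imul`, `idiv`, `redistribute`) are hypothetical, and only the two singletons $\theta_1$ and $\theta_4$ are recomputed from the masses of Table 60.

```python
# Sketch: redistribution of an imprecise focal mass to its singletons,
# proportionally to (singleton mass + eps), with eps = 0.001 as in the text.

EPS = 0.001

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def imul(a, b):
    # Valid for intervals with non-negative bounds.
    return (a[0] * b[0], a[1] * b[1])

def idiv(a, b):
    # Valid for non-negative a and strictly positive b.
    return (a[0] / b[1], a[1] / b[0])

def redistribute(m_focal, singleton_masses, eps=EPS):
    """Return the share of m_focal sent back to each singleton of the focal element."""
    weights = {t: iadd(ms, (eps, eps)) for t, ms in singleton_masses.items()}
    total = (0.0, 0.0)
    for w in weights.values():
        total = iadd(total, w)
    ratio = idiv(m_focal, total)
    return {t: imul(w, ratio) for t, w in weights.items()}

ZERO = (0.0, 0.0)
m8 = (0.194250, 0.329250)

# f1 = theta4 U theta8
share_f1 = redistribute((0.152524, 0.323743), {4: ZERO, 8: m8})
# f3 = complement of theta4 U theta8
share_f3 = redistribute((0.106767, 0.226620), {t: ZERO for t in (1, 2, 3, 5, 6, 7)})
# I_t = whole frame
singles_it = {t: ZERO for t in range(1, 8)}
singles_it[8] = m8
share_it = redistribute((0.100351, 0.236255), singles_it)

dsmp_theta1 = iadd(share_f3[1], share_it[1])   # w_theta1 + v_theta1
dsmp_theta4 = iadd(share_f1[4], share_it[4])   # x_theta4 + v_theta4
print(dsmp_theta1, dsmp_theta4)
```

Rounded to four decimals, the two intervals should agree with the rows of $\theta_1$ and $\theta_4$ in Table 61.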


- Answer to Q1: As we have shown, it is possible to fuse imprecise bba’s with PCR6, and with PCR5 too (see [6], Vol. 2), to get an imprecise result for decision-making support under uncertainty and imprecision. It is also possible to compute the exact imprecise values of DSmP if necessary. According to our analysis and our results, and using either the max of Bel, the max of Pl or the max of DSmP criterion, the decision will be to evacuate the building B. Of course, a similar analysis can be done to answer the question Q2 when working with imprecise bba’s, and for computing imprecise BetP values as well.


4 Qualitative approach 


In this section we just show how the fusion and decision-making can be done using qualitative information expressed with labels. In our previous examples the quantitative bba’s have been defined ad hoc, satisfying some reasonable modeling and using minimal assumptions compatible with what is given in the statement of the VBIED problem. The numerical values can be slightly changed (as we have shown in Examples 1 and 2, or in Examples 3 or 4) or they can even be taken as imprecise as in Example 9, but they still need to be kept coherent with the sources’ reports in order to obtain what we consider as pertinent, motivated and fully justified answers to questions Q1 and Q2.

⁹ Actually for $\theta_8$, one gets with exact calculus of imprecision $DSmP_{\epsilon,PCR6}(\theta_8) = [0.2072, 1.3281]$, but since a probability cannot be greater than 1, the upper bound of the imprecision interval has been set to 1.

In this section we show how to solve the problem using qualitative information expressed with labels. We investigate the possibility to work either with a minimal set of labels $\{L_1 = \text{Low}, L_2 = \text{High}\}$ (i.e. with $m = 2$ labels) or with a more refined set consisting of 3 labels $\{L_1 = \text{Low}, L_2 = \text{Medium}, L_3 = \text{High}\}$ (i.e. with $m = 3$ labels). Each set is extended with minimal $L_0$ and maximal $L_{m+1}$ labels as follows (see [6], Vol. 3, Chap. 2 for examples and details)

$\mathcal{L}_2 = \{L_0 \equiv 0, L_1 = \text{Low}, L_2 = \text{High}, L_3 \equiv 1\}$

and

$\mathcal{L}_3 = \{L_0 \equiv 0, L_1 = \text{Low}, L_2 = \text{Medium}, L_3 = \text{High}, L_4 \equiv 1\}$


To simplify the presentation, we only present the re- 
sults when combining directly the sources altogether 
and considering that they have all the same maximal re- 
liability and importance in the fusion process. In other 
words, we just consider the qualitative counterpart of 
Example 1 only. 


4.1 Fusion of sources using L2 


Example 10: When using $\mathcal{L}_2$, the qualitative inputs¹⁰ of the VBIED problem are chosen according to Table 62.


Table 62: Qualitative inputs using $\mathcal{L}_2$.


Using the DSm field and linear algebra of refined labels based on the equidistant labels assumption, one gets the following mapping between labels and numbers: $L_0 = 0$, $L_1 = 1/3$, $L_2 = 2/3$ and $L_3 = 1$. Therefore, Table 62 is equivalent to the quantitative inputs of Table 63 (which are close to the numerical values taken in Example 1).
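
Under the equidistant labels assumption, the label $L_i$ of an extended set with $m$ inner labels maps to the number $i/(m+1)$. The small sketch below only illustrates this mapping; the function name `label_value` is hypothetical.

```python
from fractions import Fraction

def label_value(i, m):
    """Numeric value of label L_i in the extended set {L_0, ..., L_{m+1}},
    assuming equidistant labels."""
    if not 0 <= i <= m + 1:
        raise ValueError("label index out of range")
    return Fraction(i, m + 1)

# Extended set with m = 2 inner labels (Low, High)
print([str(label_value(i, 2)) for i in range(4)])
# Extended set with m = 3 inner labels (Low, Medium, High)
print([str(label_value(i, 3)) for i in range(5)])
```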

Applying PCR5 and PCR6 fusion rules, one gets the 
results given in Table 64 for quantitative and approxi- 
mate qualitative bba’s. 

¹⁰ When dealing with qualitative information, we prefix the notations with the letter ’q’; for example, the quantitative bba m(.) becomes the qualitative bba qm(.), etc.

0 1/3 0
2/3 0 1/3
0 2/3 0
1/3 0 2/3


Table 63: Corresponding quantitative inputs.


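focal element | $m_{PCR5} \approx qm_{PCR5}$ | $m_{PCR6} \approx qm_{PCR6}$
$\theta_8$ | 0.25926 ≈ $L_1$ | 0.25926 ≈ $L_1$
$\theta_4 \cup \theta_8$ | 0.36145 ≈ $L_1$ | 0.3157 ≈ $L_1$
$\theta_6 \cup \theta_8$ | 0.093855 ≈ $L_0$ | 0.13198 ≈ $L_0$
$\overline{\theta_4 \cup \theta_8}$ | 0.19158 ≈ $L_1$ | 0.16108 ≈ $L_0$
$I_t$ | 0.093855 ≈ $L_0$ | 0.13198 ≈ $L_0$

Table 64: Results of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 62.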


One sees that the crude approximation of numerical values by their closest corresponding labels in $\mathcal{L}_2$ can yield an unnormalized qualitative bba. For example, $qm_{PCR6}(.)$ is not normalized since the sum of the labels of the focal elements in the right column of Table 64 is $L_1 + L_1 + L_0 + L_0 + L_0 = L_2 \neq L_3$. To preserve the normalization of the qbba result it is better to work with refined labels as suggested in [6], Vol. 3, Chap. 2. Using refined labels, one now gets a better approximation as shown in Table 65.
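
The loss of normalization can be reproduced numerically. The sketch below assumes that the crude approximation simply picks the label of the extended set whose numerical value is closest to the mass; the function name `crude_label` is hypothetical and the masses are those of Table 64.

```python
def crude_label(x, m):
    """Index of the label in {L_0, ..., L_{m+1}} closest to the number x,
    under the equidistant labels assumption (L_i = i / (m + 1))."""
    return round(x * (m + 1))

# Masses of theta8, theta4 U theta8, theta6 U theta8, complement(theta4 U theta8), I_t
m_pcr5 = [0.25926, 0.36145, 0.093855, 0.19158, 0.093855]
m_pcr6 = [0.25926, 0.3157, 0.13198, 0.16108, 0.13198]

labels_pcr5 = [crude_label(x, 2) for x in m_pcr5]
labels_pcr6 = [crude_label(x, 2) for x in m_pcr6]

# A qualitative bba over the extended set with m = 2 is normalized
# when the label indices sum to m + 1 = 3.
print(labels_pcr5, sum(labels_pcr5))
print(labels_pcr6, sum(labels_pcr6))
```

With these inputs the PCR6 labels sum to index 2 rather than 3, which is the normalization defect pointed out in the text.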


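focal element | $m_{PCR5} \approx qm_{PCR5}$ | $m_{PCR6} \approx qm_{PCR6}$
$\theta_8$ | 0.25926 ≈ $L_{0.79}$ | 0.25926 ≈ $L_{0.79}$
$\theta_4 \cup \theta_8$ | 0.36145 ≈ $L_{1.08}$ | 0.31570 ≈ $L_{0.95}$
$\theta_6 \cup \theta_8$ | 0.09385 ≈ $L_{0.28}$ | 0.13198 ≈ $L_{0.39}$
$\overline{\theta_4 \cup \theta_8}$ | 0.19158 ≈ $L_{0.57}$ | 0.16108 ≈ $L_{0.48}$
$I_t$ | 0.09385 ≈ $L_{0.28}$ | 0.13198 ≈ $L_{0.39}$

Table 65: Results of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ with refined labels.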


It can be easily verified that the qbba’s based on 
refined label approximations are now (qualitatively) 
normalized (because the sum of refined labels of each 
column is equal to L3). 


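The results of qDSmP based on refined and crude approximations are given in Table 66.

singleton | $qDSmP_{\epsilon,PCR5}(.)$ | $qDSmP_{\epsilon,PCR6}(.)$
$\theta_1$ | 0.0323 ≈ $L_{0.10}$ ≈ $L_0$ | 0.0273 ≈ $L_{0.08}$ ≈ $L_0$
$\theta_2$ | 0.0323 ≈ $L_{0.10}$ ≈ $L_0$ | 0.0273 ≈ $L_{0.08}$ ≈ $L_0$
$\theta_3$ | 0.0323 ≈ $L_{0.10}$ ≈ $L_0$ | 0.0273 ≈ $L_{0.08}$ ≈ $L_0$
$\theta_4$ | 0.0017 ≈ $L_{0.00}$ ≈ $L_0$ | 0.0017 ≈ $L_{0.01}$ ≈ $L_0$
$\theta_5$ | 0.0323 ≈ $L_{0.10}$ ≈ $L_0$ | 0.0274 ≈ $L_{0.08}$ ≈ $L_0$
$\theta_6$ | 0.0326 ≈ $L_{0.10}$ ≈ $L_0$ | 0.0279 ≈ $L_{0.08}$ ≈ $L_0$
$\theta_7$ | 0.0323 ≈ $L_{0.10}$ ≈ $L_0$ | 0.0273 ≈ $L_{0.08}$ ≈ $L_0$
$\theta_8$ | 0.8042 ≈ $L_{2.40}$ ≈ $L_2$ | 0.8338 ≈ $L_{2.51}$ ≈ $L_3$

Table 66: Results of $qDSmP_{\epsilon}$ for Table 62.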


Answer to Q1 using crude approximation: Based on these qualitative results, one sees that using the crude approximation (i.e. using only labels in $\mathcal{L}_2$) one gets¹¹

• with qPCR5

$qP(\theta_6 \cup \theta_7 \cup \theta_8) \in [L_1, L_3]$
$qP(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [L_0, L_2]$
$qP(\theta_7 \cup \theta_8) \in [L_1, L_3]$
$qP(\overline{\theta_7 \cup \theta_8}) \in [L_0, L_2]$
$qP(\theta_8) \in [L_1, L_2]$
$qP(\bar{\theta}_8) \in [L_1, L_2]$

• with qPCR6

$qP(\theta_6 \cup \theta_7 \cup \theta_8) \in [L_1, L_2]$
$qP(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [L_1, L_2]$
$qP(\theta_7 \cup \theta_8) \in [L_1, L_2]$
$qP(\overline{\theta_7 \cup \theta_8}) \in [L_1, L_2]$
$qP(\theta_8) \in [L_1, L_2]$
$qP(\bar{\theta}_8) \in [L_1, L_2]$

These results show that it is almost impossible to answer the question Q1 clearly and fairly using the max of Bel or the max of Pl criteria based on such very inaccurate qualitative bba’s obtained with such a crude approximation. However, it is possible and easy to answer Q1 using the qualitative DSmP values: according to Table 66, the final decision must be to evacuate the building B when considering the level of the DSmP values of $\theta_6 \cup \theta_7 \cup \theta_8$, $\theta_7 \cup \theta_8$, or $\theta_8$.


Answer to Q1 using refined approximation: Using the refined approximation based on refined labels, which is more accurate, one gets

• with qPCR5

$qP(\theta_6 \cup \theta_7 \cup \theta_8) \in [L_{1.07}, L_3]$
$qP(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [L_0, L_{1.93}]$
$qP(\theta_7 \cup \theta_8) \in [L_{0.79}, L_3]$
$qP(\overline{\theta_7 \cup \theta_8}) \in [L_0, L_{2.21}]$
$qP(\theta_8) \in [L_{0.79}, L_{2.43}]$
$qP(\bar{\theta}_8) \in [L_{0.57}, L_{2.21}]$

• with qPCR6

$qP(\theta_6 \cup \theta_7 \cup \theta_8) \in [L_{1.18}, L_3]$
$qP(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [L_0, L_{1.82}]$
$qP(\theta_7 \cup \theta_8) \in [L_{0.79}, L_3]$
$qP(\overline{\theta_7 \cup \theta_8}) \in [L_0, L_{2.21}]$
$qP(\theta_8) \in [L_{0.79}, L_{2.52}]$
$qP(\bar{\theta}_8) \in [L_{0.48}, L_{2.21}]$

One sees that the accuracy of the result obtained using refined labels allows us to take the decision more easily. Indeed, using the refined approximation, it is possible here to take the decision based on the max of Bel, or on the max of Pl, and whatever the decision-support hypothesis used ($\theta_6 \cup \theta_7 \cup \theta_8$, or $\theta_7 \cup \theta_8$, or $\theta_8$), the answer to question Q1 is: Evacuation of the building B. The same decision can also be taken from the analysis of the qDSmP values when considering refined labels in Table 66.

¹¹ The derivations of qBel(X) and qPl(X) were obtained using the qualitative extension of Dempster’s formulas [5], i.e. with $qBel(\bar{X}) = L_{m+1} - qPl(X)$ and $qPl(\bar{X}) = L_{m+1} - qBel(X)$. These results are valid only if the qbba is normalized, but are used here even when using a non-normalized qbba as a crude approximation.


4.2 Fusion of sources using L3


Here we propose to go further in our analysis and to use a slightly more refined set of labels defined by $\mathcal{L}_3$. We need to adapt the qualitative inputs of the VBIED problem in order to work with $\mathcal{L}_3$.


Example 11: We propose to solve the VBIED problem for the following qualitative inputs, which reflect what is reported by the sources when using labels belonging to $\mathcal{L}_3$.


Table 67: Qualitative inputs based on $\mathcal{L}_3$.


Based on the equidistant labels assumption, one gets the following mapping between labels and numbers: $L_0 = 0$, $L_1 = 1/4$, $L_2 = 2/4$, $L_3 = 3/4$ and $L_4 = 1$. Therefore, Table 67 is equivalent to the quantitative inputs of Table 68 (which are closer to the numerical values taken in Example 1 than the inputs chosen in Table 63 for Example 10).





1 0 0.25 0 
0 0.75 0 0.25 
0 0 0.75 0 
0 0.25 0 0.75 


Table 68: Corresponding quantitative inputs. 


Applying PCR5 and PCR6 fusion rules, one gets the results given in Table 69 for quantitative and approximate qualitative bba’s using refined and crude approximations of labels.

focal element | $m_{PCR5} \approx qm_{PCR5}$ | $m_{PCR6} \approx qm_{PCR6}$
$\theta_8$ | 0.20312 ≈ $L_{0.81}$ ≈ $L_1$ | 0.20312 ≈ $L_{0.81}$ ≈ $L_1$
$\theta_4 \cup \theta_8$ | 0.34269 ≈ $L_{1.37}$ ≈ $L_1$ | 0.29979 ≈ $L_{1.21}$ ≈ $L_1$
$\theta_6 \cup \theta_8$ | 0.11617 ≈ $L_{0.47}$ ≈ $L_0$ | 0.15370 ≈ $L_{0.61}$ ≈ $L_1$
$\overline{\theta_4 \cup \theta_8}$ | 0.22185 ≈ $L_{0.89}$ ≈ $L_1$ | 0.18969 ≈ $L_{0.76}$ ≈ $L_1$
$I_t$ | 0.11617 ≈ $L_{0.47}$ ≈ $L_0$ | 0.15370 ≈ $L_{0.61}$ ≈ $L_1$

Table 69: Results of $m_0 \oplus m_1 \oplus m_2 \oplus m_3$ for Table 67.


It can be easily verified that the qbba’s based on refined label approximations are (qualitatively) normalized because the sum of the refined labels of each column is equal to $L_4$. Using the crude approximation when working only with labels in $\mathcal{L}_3$, we get non-normalized qbba’s. The results of qDSmP based on refined and crude approximations are given in Table 70.


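singleton | $qDSmP_{\epsilon,PCR5}(.)$ | $qDSmP_{\epsilon,PCR6}(.)$
$\theta_1$ | 0.0375 ≈ $L_{0.15}$ ≈ $L_0$ | 0.0323 ≈ $L_{0.13}$ ≈ $L_0$
$\theta_2$ | 0.0375 ≈ $L_{0.15}$ ≈ $L_0$ | 0.0323 ≈ $L_{0.13}$ ≈ $L_0$
$\theta_3$ | 0.0375 ≈ $L_{0.15}$ ≈ $L_0$ | 0.0323 ≈ $L_{0.13}$ ≈ $L_0$
$\theta_4$ | 0.0022 ≈ $L_{0.01}$ ≈ $L_0$ | 0.0022 ≈ $L_{0.01}$ ≈ $L_0$
$\theta_5$ | 0.0375 ≈ $L_{0.15}$ ≈ $L_0$ | 0.0323 ≈ $L_{0.13}$ ≈ $L_0$
$\theta_6$ | 0.0381 ≈ $L_{0.15}$ ≈ $L_0$ | 0.0331 ≈ $L_{0.13}$ ≈ $L_0$
$\theta_7$ | 0.0375 ≈ $L_{0.15}$ ≈ $L_0$ | 0.0323 ≈ $L_{0.13}$ ≈ $L_0$
$\theta_8$ | 0.7722 ≈ $L_{3.09}$ ≈ $L_3$ | 0.8032 ≈ $L_{3.21}$ ≈ $L_3$

Table 70: Results obtained with $qDSmP_{\epsilon}$ for Table 67.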


One sees that the use of refined labels makes it possible to obtain normalized qualitative probabilities. It is not possible to get normalized qualitative probabilities when using only crude approximations with labels in $\mathcal{L}_3$ for this example.


Answer to Q1: Using refined labels (which is more 
accurate), one gets finally 


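• with qPCR5

$qP(\theta_6 \cup \theta_7 \cup \theta_8) \in [L_{1.28}, L_4]$
$qP(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [L_0, L_{2.72}]$
$qP(\theta_7 \cup \theta_8) \in [L_{0.81}, L_4]$
$qP(\overline{\theta_7 \cup \theta_8}) \in [L_0, L_{3.19}]$
$qP(\theta_8) \in [L_{0.81}, L_{3.12}]$
$qP(\bar{\theta}_8) \in [L_{0.88}, L_{3.19}]$

• with qPCR6

$qP(\theta_6 \cup \theta_7 \cup \theta_8) \in [L_{1.42}, L_4]$
$qP(\overline{\theta_6 \cup \theta_7 \cup \theta_8}) \in [L_0, L_{2.58}]$
$qP(\theta_7 \cup \theta_8) \in [L_{0.81}, L_4]$
$qP(\overline{\theta_7 \cup \theta_8}) \in [L_0, L_{3.19}]$
$qP(\theta_8) \in [L_{0.81}, L_{3.24}]$
$qP(\bar{\theta}_8) \in [L_{0.76}, L_{3.19}]$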





One sees that with PCR5 or PCR6, whatever the decision-support hypothesis we consider ($\theta_6 \cup \theta_7 \cup \theta_8$, $\theta_7 \cup \theta_8$, or $\theta_8$), one will decide to evacuate the building B based on the max of Bel, the max of Pl or the DSmP values, except for the case of PCR5 with $\theta_8$ based on the max of Bel or the max of Pl. In this case, the PCR5 result suggests NOT to evacuate B, contrary to the PCR6 result. When $\theta_8$ is the preferred (optimistic) decision-support hypothesis, one sees here the main effect of the difference between PCR5 and PCR6 for decision-making support. But as already stated, for such a problem the most prudent strategy for decision-making is to consider the decision-support hypothesis $\theta_6 \cup \theta_7 \cup \theta_8$, which captures all aspects of potential danger. Using such a reasonable strategy, both rules PCR5 and PCR6 yield the same decision: Evacuation of the building B.


5 Conclusions 


In this paper we have presented a modeling for 
solving the Vehicle-Born Improvised Explosive Device 
(VBIED) problem with Dezert-Smarandache Theory 
(DSmT) framework. We have shown how it is possi- 
ble to compute imprecise probabilities of all decision- 
support hypotheses and how to take into account the 
reliabilities and the importances of the sources of infor- 
mation in decision-making support. The strong impact 
of prior information has also been analyzed, as well as 
the possibility to deal directly with imprecise sources 
of information and even with qualitative reports. We
have answered, with full justification, the two main
questions asked in the VBIED problem by John Lav- 
ery and Simon Maskell: 1) what is the final decision to 
take, and 2) what is the best fusion subsystem to choose 
(APNR or the pool of experts)? The analysis done in 
this paper is based on a very limited number of reason- 
able assumptions and could be adapted for solving more 
complicated security problems involving imprecise, in- 
complete and conflicting sources of information. 
A PCR-BIMM Filter for Maneuvering Target Tracking


Jean Dezert 
Benjamin Pannetier 


Abstract — In this paper we show how to correct 
and improve the Belief Interacting Multiple Model fil- 
ter (BIMM) proposed in 2009 by Nassreddine et al. 
for tracking maneuvering targets. Our improved al- 
gorithm, called PCR-BIMM is based on results devel- 
oped in DSmT (Dezert-Smarandache Theory) frame- 
work and concerns two main steps of BIMM: 1) the 
update of the basic belief assignment of modes which is 
done by the Proportional Conflict Redistribution Rule 
no. 5 rather than Smets’ rule (conjunctive rule); 2)
the global target state estimation which is obtained from 
the DSmP probabilistic transformation rather than the 
commonly used Pignistic transformation. Monte-Carlo 
simulation results are presented to show the perfor- 
mances of this PCR-BIMM filter with respect to clas- 
sical IMM and BIMM filters obtained on a very simple 
maneuvering target tracking scenario. 

Keywords: Tracking, IMM, Maneuvering target, BIMM, DSmT.


1 Introduction 


At the Fusion 2009 international conference, Nassreddine, Abdallah, and Denoeux [13] proposed an interesting idea to extend the classical Interacting Multiple Models (IMM) filter with belief function theory in order to deal with unknown and varying motion models. Their algorithm is based on the classical/historical belief function theory developed by Shafer in 1976 [14], known as Dempster-Shafer Theory (DST), and requires both Smets’ rule, i.e. the conjunctive fusion rule equivalent to the non-normalized Dempster’s rule, and the probabilistic pignistic transformation. This algorithm is called the Belief Interacting Multiple Model algorithm (BIMM). According to the authors’ results, the BIMM algorithm outperforms the classical IMM algorithm, at least in the vehicle localization problem studied in their work. These appealing results and the possible extension of IMM in the belief function theory framework motivate our interest in analyzing and evaluating this new BIMM filter.
Originally published as Dezert J., Pannetier B., A PCR-BIMM filter for
maneuvering target tracking, in Proc. of Fusion 2010, Edinburgh, Scotland, 
UK, 26-29 July 2010, and reprinted with permission. 


A deep analysis of the paper leads to the following
comments:


1. The derivation of the predicted prior basic belief 
assignment of modes in Step 1 of BIMM algorithm 
was clearly wrong in [13] as proved in the sequel. 
This mistake implies a serious doubt on the validity 
of the results presented in [13]. 


2. The simulation results presented in [13] cannot
be verified precisely, nor reproduced, because some
setting parameters (like the $\alpha_j$ discounting factors)
required for the BIMM filter have not been provided
by the authors and the essential Step 3 of the
algorithm was not detailed enough.


3. It is known (see Chapter 1 of [15] Vol. 3) that 
the conjunctive rule does not perform efficiently in 
a sequential fusion process because the empty set 
is an absorbing element for the conjunctive fusion 
rule. Therefore, in order to implement successfully 
the BIMM filter, some ad-hoc numerical techniques
are necessary (or some extra normalization steps)
in the BIMM algorithm in order to prevent the
mass of belief committed to the empty set from becoming
close to one, and to keep Smets’ rule responsive to
new information. This serious problem has unfortunately
not been discussed in [13].


From the theoretical point of view, it is quite surprising that one gets better performance with the BIMM (which proceeds with less specific information since it deals with non-Bayesian basic belief assignments) than with the classical Bayesian IMM filter (which deals with more specific information, i.e. with Bayesian basic belief assignments). The first purpose of this work is to verify whether the conclusions given in [13] are valid on a very simple, reproducible maneuvering target tracking scenario. We also want to see whether a more justified belief-based IMM algorithm can be developed to improve the BIMM algorithm, and to evaluate it to get a fair comparison of its performance with respect to the classical IMM filter. The improvement of the BIMM algorithm we propose in this paper is based on advanced theoretical results obtained in the development of Dezert-Smarandache Theory (DSmT) of information fusion [15]. This paper is organized as follows: after a brief recall of the classical (fixed structure) IMM algorithm given in section 2, section 3 presents the Belief IMM algorithm and its flaws. Motivations for the improvement of the BIMM filter are presented in section 4, together with the main steps of our new algorithm called the PCR-BIMM filter (Proportional Conflict Redistribution-based BIMM). In section 5, we examine the performance of the IMM and PCR-BIMM filters on a very simple tracking scenario through Monte-Carlo results. Conclusions and perspectives for further investigations are given in section 6.


2 Classical IMM algorithm 


The IMM filter is one of the most widely used algorithms for tracking maneuvering targets and was originally developed by Henk Blom in the eighties [5, 6, 2]. The IMM filter is a recursive filter with a low complexity that has proved very efficient in many real-data tracking applications [4], and many extensions of IMM have been developed since its original publication for dealing with the multitarget-multisensor case, cluttered environments, etc.; see [12] for a good survey of Multiple Models techniques. The classical IMM algorithm considers a hybrid Multiple Models (MM) system which obeys one of a finite number $r$ of dynamic models $M_i$, $i = 1, \ldots, r$, and estimates the posterior mode probabilities from their prior probabilities and target measurements (Bayesian framework). Its specificity is that IMM mixes hypotheses with depth 1 only at the start of each cycle and thus has a low complexity of order $O(r)$, while providing the same performance as the more effective Generalized Pseudo-Bayesian estimator of order 2. We briefly recall the principle of the classical IMM filter; see [3, 4] for more details with examples. A hybrid MM system is characterized by two state variables: 1) the base-state variable $x(k)$ of dimension $n_x$, including the position, velocity, etc. of the target, and 2) a modal state $M_i(k)$ belonging to a known finite set $\mathcal{M}_r(k) = \{M_i(k), i = 1, \ldots, r\}$ of $r$ possible dynamic models for the target during its motion. For simplicity of presentation, we consider only a fixed-structure IMM, i.e. $\mathcal{M}_r(k) = \mathcal{M}_r$ is invariant with time. Variable-structure IMM is possible and has been introduced by Xiao-Rong Li in [10, 11]. The hybrid system is described by the equations¹


$$x(k) = F[M(k)]\, x(k-1) + v[k-1, M(k)]$$

$$z(k) = H[M(k)]\, x(k) + w[k, M(k)]$$

where $M(k)$ is the mode in effect during the sampling period ending at time $k$, belonging to $\mathcal{M}_r$. $x(k)$ and $z(k)$ are the target state and observation vectors. The set of all available measurements up to $k$ is denoted $Z^k$. $F[M(k)]$ and $H[M(k)]$ are known matrices depending on the dynamic model $M(k)$. The statistics of the process and observation noises $v[k-1, M(k)]$ and $w[k, M(k)]$ can differ from mode to mode. Usually one considers $v[k-1, M(k) = M_i] \sim \mathcal{N}(\bar{v}_i, Q_i)$ and $w[k, M(k) = M_i] \sim \mathcal{N}(\bar{w}_i, R_i)$ with known covariance matrices $Q_i$ and $R_i$ respectively.

¹ For simplicity, we assume here linear systems.

The mode jump process is modeled as a Markov chain with known a priori probabilities $P\{M(0) = M_j\} = \mu_j(k = 0)$ and known transition probabilities $P\{M(k) = M_j | M(k-1) = M_i\} = \pi_{ij}$. A cycle $(k-1) \rightarrow k$ of the classical IMM algorithm consists of the following steps:

• Step 0 (Initialization at $k = 0$): Definition of dynamic and observation matrices, choice of process and observation noise levels, sampling period, initialization of the filters adapted to each mode, choice of the prior mode probabilities $P_j$ and of the transition probability matrix $P_t \triangleq [\pi_{ij} = P\{M_j(k) | M_i(k-1)\}]$, assumed known and time-invariant.

• Step 1 (Interaction-mixing, $j = 1, \ldots, r$): Mixing of the previous cycle mode-conditioned state estimates $\hat{x}_i(k-1|k-1)$ and covariances, using the mixing probabilities $\mu_{i|j}(k-1|k-1)$, to initialize the current cycle of each mode-conditioned filter with $\hat{x}_j^0(k-1|k-1)$. This is done by

$$\hat{x}_j^0(k-1|k-1) = \sum_{i=1}^{r} \mu_{i|j}(k-1|k-1)\, \hat{x}_i(k-1|k-1) \qquad (1)$$

$$P_j^0(k-1|k-1) = \sum_{i=1}^{r} \mu_{i|j}(k-1|k-1) \{ P_i(k-1|k-1) + [\hat{x}_i(k-1|k-1) - \hat{x}_j^0(k-1|k-1)] \cdot [\hat{x}_i(k-1|k-1) - \hat{x}_j^0(k-1|k-1)]' \} \qquad (2)$$

where the elements $\mu_{i|j}(k-1|k-1)$ of the mixing probability (vertical) vector $\mu_{k-1|k-1}(.|M_j(k)) = [\mu_{i|j}(k-1|k-1), i = 1, \ldots, r]'$ are calculated by

$$\mu_{i|j}(k-1|k-1) = P\{M_i(k-1) | M_j(k), Z^{k-1}\} = \frac{\pi_{ij}\, \mu_i(k-1)}{\mu_j^-(k)} \qquad (3)$$

with

$$\mu_j^-(k) = P\{M_j(k) | Z^{k-1}\} = \sum_{i=1}^{r} \pi_{ij}\, \mu_i(k-1) \qquad (4)$$

The equation (4) can be written more concisely as:

$$\mu_k^-(.) = P_t' \cdot \mu_{k-1}(.) \qquad (5)$$

where $P_t = [\pi_{ij}]$ and $\mu_{k-1}(.)$ represents the (vertical) vector of prior probability of modes, i.e.

$$\mu_{k-1}(.) = [P\{M_i(k-1) | Z^{k-1}\}]' = [\mu_1(k-1) \ldots \mu_r(k-1)]'$$

and $\mu_k^-(.)$ represents the (vertical) vector of predicted prior probability of modes

$$\mu_k^-(.) = [P\{M_j(k) | Z^{k-1}\}]' = [\mu_1^-(k) \ldots \mu_r^-(k)]'$$

• Step 2 (Mode conditioned filter): From the prior mixed statistics $\hat{x}_j^0(k-1|k-1)$ and $P_j^0(k-1|k-1)$ and the target measurement $z(k)$, one calculates $\hat{x}_j(k|k)$ and $P_j(k|k)$ for each possible mode in effect ($r$ filters running in parallel) by a specific filter matched to mode $M_j$, typically a Kalman filter if the dynamic and observation systems are linear, or an Extended Kalman Filter (EKF) to deal with linear or non-linear equations, or any other sophisticated filter if necessary for dealing for example with miss-detections and false alarms [3]. The likelihood $\Lambda_j(k)$ of the filter $j$ is assumed to be Gaussian with

$$\Lambda_j(k) = \frac{1}{(2\pi)^{n_z/2} \sqrt{|S_j(k)|}} \exp\left\{ -\frac{1}{2}\, \tilde{z}_j'(k)\, S_j^{-1}(k)\, \tilde{z}_j(k) \right\} \qquad (6)$$

where $\tilde{z}_j(k) = z(k) - \hat{z}_j(k|k-1)$ is the innovation, $S_j(k)$ is the covariance of the innovation provided by the filter $j$, and $n_z$ is the dimension of $z(k)$.

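• Step 3 (Mode probability update): The probability $\mu_j(k)$ of each mode $j$ for $j = 1, \ldots, r$ is calculated by

$$\mu_j(k) = P\{M_j(k) | Z^k\} = \frac{\Lambda_j(k)\, \mu_j^-(k)}{\sum_{i=1}^{r} \Lambda_i(k)\, \mu_i^-(k)} \qquad (7)$$

• Step 4 (Global estimation for output purpose): The global estimate $\hat{x}(k|k)$ and the covariance of estimation error $P(k|k)$ are given by:

$$\hat{x}(k|k) = \sum_{j=1}^{r} \mu_j(k)\, \hat{x}_j(k|k) \qquad (8)$$

$$P(k|k) = \sum_{j=1}^{r} \mu_j(k) \{ P_j(k|k) + [\hat{x}_j(k|k) - \hat{x}(k|k)] \cdot [\hat{x}_j(k|k) - \hat{x}(k|k)]' \} \qquad (9)$$
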
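
The probabilistic parts of the IMM cycle (Steps 1, 3 and 4) can be sketched as follows. This is only an illustrative sketch under simplifying assumptions: the state and the measurement are taken as scalars so that no matrix library is needed, the mode-conditioned filters of Step 2 are not implemented, and all function names are hypothetical.

```python
import math

def mixing_probabilities(pi, mu_prev):
    """Predicted mode probabilities and mixing probabilities.
    pi[i][j] = P{M_j(k) | M_i(k-1)}, mu_prev[i] = mu_i(k-1)."""
    r = len(mu_prev)
    mu_pred = [sum(pi[i][j] * mu_prev[i] for i in range(r)) for j in range(r)]
    mix = [[pi[i][j] * mu_prev[i] / mu_pred[j] for j in range(r)] for i in range(r)]
    return mu_pred, mix  # mix[i][j] plays the role of mu_{i|j}

def mix_estimates(mix, x, P):
    """Mixed initial estimates and variances for a scalar state."""
    r = len(x)
    x0 = [sum(mix[i][j] * x[i] for i in range(r)) for j in range(r)]
    P0 = [sum(mix[i][j] * (P[i] + (x[i] - x0[j]) ** 2) for i in range(r))
          for j in range(r)]
    return x0, P0

def gaussian_likelihood(innovation, S):
    """Gaussian likelihood of a scalar innovation with variance S."""
    return math.exp(-0.5 * innovation ** 2 / S) / math.sqrt(2.0 * math.pi * S)

def update_mode_probabilities(likelihoods, mu_pred):
    """Bayesian update of the mode probabilities from the filter likelihoods."""
    weights = [lam * mu for lam, mu in zip(likelihoods, mu_pred)]
    c = sum(weights)
    return [w / c for w in weights]

def combine(mu, x, P):
    """Global estimate and variance as a mode-probability-weighted mixture."""
    x_hat = sum(m * xi for m, xi in zip(mu, x))
    P_hat = sum(m * (Pi + (xi - x_hat) ** 2) for m, xi, Pi in zip(mu, x, P))
    return x_hat, P_hat

# Two-mode example with made-up numbers
pi = [[0.9, 0.1], [0.2, 0.8]]
mu_pred, mix = mixing_probabilities(pi, [0.6, 0.4])
x0, P0 = mix_estimates(mix, [0.0, 2.0], [1.0, 1.0])
likelihoods = [gaussian_likelihood(0.5, 1.0), gaussian_likelihood(1.5, 1.0)]
mu = update_mode_probabilities(likelihoods, mu_pred)
print(mu_pred, mu, combine(mu, [0.0, 2.0], [1.0, 1.0]))
```

In a full filter the innovations and their variances passed to `gaussian_likelihood` would come from the mode-matched Kalman filters of Step 2; here they are made-up values.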
3 Belief-based IMM algorithm 


In 2009, Nassreddine et al. proposed in [13] an extension of the classical IMM filter in the framework of Dempster-Shafer Theory (DST) [14] for dealing with unknown and varying motion models. The idea was to select a set of candidate models², and then to estimate a current basic belief assignment (bba) defined on the power-set of this set of models, based on the fusion of bba’s built from measurement likelihoods with the predicted bba of the models using Smets’ rule³ denoted $\textcircled{\cap}$. From the result of Smets’ fusion, the mixed state of the classical IMM filter is replaced with the pignistic averaging of the mode-conditioned state estimates. This new extension of the IMM filter was called BIMM (Belief-based IMM) since it uses belief function theory to represent the uncertainty in the switches between the modes. This section presents succinctly the principle of the BIMM filter. We also justify our motivation for developing a new belief-based IMM algorithm.

² Corresponding to the so-called frame of discernment, usually denoted $\Theta$ in DST.

³ Smets’ rule is nothing but the non-normalized Dempster’s rule of combination, i.e. the conjunctive rule.

The steps of BIMM are actually very close to the steps of classical IMM, except that predicted and updated mode probabilities are estimated from pignistic probabilities derived from a basic belief assignment updated with the conjunctive rule of combination. The main changes of BIMM concern Step 1 and Step 3 of the IMM algorithm. The frame of discernment chosen in BIMM coincides with the set of possible models, i.e. $\Theta(k) = \mathcal{M}_r(k) = \{M_i(k), i = 1, \ldots, r\}$. Instead of computing recursively the mixing $\mu_{i|j}(.)$ and updated $\mu_j(.)$ probabilities with eqs. (3) and (4) as done with the classical IMM, one deals with bba’s defined on the power-set $2^\Theta$ of the frame of discernment. Mathematically, a normal bba $m(.)$ is defined⁴ as a mapping from $2^\Theta \rightarrow [0, 1]$ such that $m(\emptyset) = 0$ and $\sum_{A \in 2^\Theta} m(A) = 1$. $A$ is a focal element of $m(.)$ if $m(A) > 0$. Any discrete probability measure can be interpreted as a special belief function, called a Bayesian belief function [14], whose focal elements are singletons of $2^\Theta$. Any belief function with a bba $m(.)$ can be approximated by a subjective probability measure thanks to the pignistic transformation [17] defined for all $M_i \in \Theta(k)$ by

$$BetP\{M_i\} = \sum_{A \in 2^\Theta,\, M_i \in A} \frac{1}{|A|} \cdot \frac{m(A)}{1 - m(\emptyset)} \qquad (10)$$

where $|A|$ is the cardinality of $A$.

The steps of BIMM proposed in [13] are⁵:

• Step 0 (Initialization at $k = 0$): Definition of dynamic and observation matrices, choice of process and observation noise levels, sampling period, initialization of the filters adapted to each mode. The prior probabilities of modes $\{P_j = P\{M(0) = M_j\}, j = 1, \ldots, r\}$ used in IMM are replaced⁶ by the vacuous belief assignment $m(\Theta(k = 0) = M_1 \cup M_2 \cup \ldots \cup M_r) = 1$. The probability transition matrix $P_t = [\pi_{ij}]$ is replaced by a bba transition matrix⁷ $M_t \triangleq [m_{ij}]$ having a very simple structure defined by the $r$ implication rules: ”$R_i$: if $M(k) = M_i(k)$ then $M(k+1) = M_i(k+1)$” with known
belief coefficients 6; € [0,1] for i = 1,2,...,r with 
M,U...UM,(k + 1)|M;(k)). 


4We use boldface letters to denote vectors or matrices. 

5We use a more classical notation generally adopted in the 
tracking community. 

⁶ Note that this initialization can also be done by taking $m(M_j(k = 0)) = P_j$ if one considers that the prior probabilities of modes are accurate enough.

⁷ Called switching mass function in [13].
• Step 1 (Interaction-mixing): The mixing probabilities $\mu_{i|j}(k-1|k-1)$ are calculated as follows:


1. The derivation of the probability vector $\mu_k^-(.) = [\mu_1^-(k) \ldots \mu_r^-(k)]'$ in classical IMM is replaced by the derivation of the predicted bba $m_k^-(.)$ given by

$$m_k^-(.) = M_t' \cdot m_{k-1}(.) \qquad (11)$$


2. The derivation of the probabilities $\mu_{i|j}(k-1|k-1) \triangleq P\{M_i(k-1) | M_j(k), Z^{k-1}\}$ is replaced by the derivation of the bba $m_{k-1|k-1}(.)$ thanks to the Generalized Bayesian Theorem (GBT) [18]. More precisely,


$m_{k-1|k-1}(.|M_j(k)) =$


[omg P*29 (LMi(& = DIIM (AOE 
(12) 


where $\Uparrow \Theta(k-1) \times \Theta(k)$ is the ballooning extension [18] of
the bba on the Cartesian product frame $\Theta(k-1) \times \Theta(k)$, and
where $\downarrow \Theta(k-1)$ represents the marginalization operation of the
bba on the frame $\Theta(k-1)$. See [18] for details and examples.


3. The derivation of the mixing probability
$\mu_{i|j}(k-1|k-1) = P\{M_i(k-1)|M_j(k), Z^{k-1}\}$ of classical IMM is
replaced by the pignistic probability drawn from
$m_{k-1|k-1}(.|M_j(k))$, that is:

$$\mu_{i|j}(k-1|k-1) = \mathrm{BetP}\{M_i(k-1)|M_j(k), Z^{k-1}\}$$

where BetP{.} is calculated with the transformation (10) using
$m_{k-1|k-1}(.|M_j(k))$ given by (12).


$\hat{\mathbf{x}}^{0j}(k-1|k-1)$ and $\mathbf{P}^{0j}(k-1|k-1)$ are calculated as in
IMM Step 1.

• Step 2: Same as IMM Step 2.

• Step 3 (Mode bba update): The updated bba $m_k(.)$ of modes is
computed from the conjunctive combination of the predicted bba
$m_k^-(.)$ given in (11) with the observed bba's⁸ $m_{k,j}(.)$,
j = 1, 2, ..., r by

$$m_k(.) = [m_{k,1} \,\textcircled{$\cap$}\, \ldots \,\textcircled{$\cap$}\, m_{k,r} \,\textcircled{$\cap$}\, m_k^-](.) \qquad (13)$$

where the observed bba's $m_{k,j}(.)$ for j = 1, ..., r are given⁹ by [13]:

$$\begin{cases} m_{k,j}(M_j(k)) = 0 \\ m_{k,j}(\bar{M}_j(k)) = \alpha_j (1 - R \Lambda_j(k)) \\ m_{k,j}(\Theta(k)) = 1 - \alpha_j (1 - R \Lambda_j(k)) \end{cases} \qquad (14)$$

$\alpha_j$ is a discounting coefficient associated with the likelihood
of the mode $M_j(k)$ and R is a normalization constant.

⁸ We mean that the bba $m_{k,j}(.)$ is built from the likelihood
$\Lambda_j(k)$, which depends on the mode $M_j(k)$ and on the available
observation z(k).

⁹ This is Appriou's model no. 1 in [1].
• Step 4 (Global estimation for output purpose): The global
estimate $\hat{\mathbf{x}}(k|k)$ and the covariance of estimation error
$\mathbf{P}(k|k)$ are given as in step 4 of classical IMM by taking
$\mu_j(k) = \mathrm{BetP}\{M_j(k)|Z^k\}$, where $\mathrm{BetP}\{M_j(k)|Z^k\}$ is the
pignistic probability that the mode $M_j$ is effective at time k.
$\mathrm{BetP}\{M_j(k)|Z^k\}$ is computed from the updated bba $m_k(.)$
given by (13).
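To make Step 3 concrete, here is a small sketch of the observed bba's (14) and of the unnormalized conjunctive combination (13) on a two-mode frame. The frozenset encoding, the function names and the numerical values of the likelihoods, of `alpha` and of `R` are illustrative assumptions, not values taken from [13]; the predicted bba is the one obtained in the corrected example of this section.

```python
from itertools import product
from typing import Dict, FrozenSet

Bba = Dict[FrozenSet[str], float]
FRAME = frozenset({"M1", "M2"})  # two-mode frame, as in the example of [13]


def observed_bba(mode: str, likelihood: float, alpha: float, R: float) -> Bba:
    """Model (14): no mass on the mode, alpha*(1 - R*likelihood) on its complement."""
    against = alpha * (1.0 - R * likelihood)
    if not 0.0 <= against <= 1.0:
        raise ValueError("alpha and R must keep the masses inside [0, 1]")
    return {FRAME - {mode}: against, FRAME: 1.0 - against}


def conjunctive(m1: Bba, m2: Bba) -> Bba:
    """Unnormalized conjunctive rule: conflicting products stay on the empty set."""
    out: Bba = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        out[a & b] = out.get(a & b, 0.0) + wa * wb
    return out


# Predicted bba of the corrected example; likelihood values are made up.
m_pred = {frozenset({"M1"}): 0.405, frozenset({"M2"}): 0.178, FRAME: 0.417}
m_k = m_pred
for mode, lik in (("M1", 0.6), ("M2", 0.2)):
    m_k = conjunctive(observed_bba(mode, lik, alpha=0.9, R=1.0), m_k)
```

With these made-up inputs, a single update already leaves about 0.38 of the mass on the empty set, which illustrates why repeated conjunctive updates can drain the belief away from the modes.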


A mistake in Step 1 of BIMM filter: The aforementioned Step 1 of
the BIMM algorithm, described with an example in [13], is clearly
incorrect: the derivation of the predicted bba $m_k^-(.)$ by (5) is
wrong because the sum of the masses of its focal elements is not
equal to one. This is easy to verify from the example in [13] when
considering only two models, taking
$\beta_1 = m(M_1(k)|M_1(k-1)) = 0.9$,
$1 - \beta_1 = 0.1 = m(M_1(k) \cup M_2(k)|M_1(k-1))$ and
$\beta_2 = m(M_2(k)|M_2(k-1)) = 0.89$, and taking the prior bba
$m_{k-1}(.) = [m(\emptyset) = 0 \;\; m(M_1(k-1)) = 0.45 \;\; m(M_2(k-1)) = 0.20 \;\; m(M_1(k-1) \cup M_2(k-1)) = 0.35]$.
Applying the wrong formula (11), one gets precisely:

$$\underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0.9 & 0 & 0.1 \\ 0 & 0 & 0.89 & 0.11 \\ 0 & 0 & 0 & 1 \end{bmatrix}}_{\mathbf{M}_t} \underbrace{\begin{bmatrix} 0 \\ 0.45 \\ 0.20 \\ 0.35 \end{bmatrix}}_{m_{k-1}(.)} = \underbrace{\begin{bmatrix} 0 \\ 0.4400 \\ 0.2165 \\ 0.3500 \end{bmatrix}}_{m_k^-(.)} \neq \underbrace{\begin{bmatrix} 0 \\ 0.44 \\ 0.21 \\ 0.35 \end{bmatrix}}_{\text{Result in [13]}}$$


One can see that the sum of the components of $m_k^-(.)$ equals
1.0065! This mistake is not due to a rounding approximation of the
result, but to a more serious mistake in the choice of the
transition matrix $\mathbf{M}_t$, which comes from a confusion in the
indices of the classical IMM transition matrix. It is easy to verify
that the correct transition matrix must actually be taken as the
transpose of $\mathbf{M}_t$. Therefore, the correct derivation of
$m_k^-(.)$ must be done by

$$m_k^-(.) \triangleq \mathbf{M}_t^{\top} \cdot m_{k-1}(.) \qquad (15)$$

For example 1 of [13], one correctly gets

$$\underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0.9 & 0 & 0 \\ 0 & 0 & 0.89 & 0 \\ 0 & 0.1 & 0.11 & 1 \end{bmatrix}}_{\mathbf{M}_t^{\top}} \underbrace{\begin{bmatrix} 0 \\ 0.45 \\ 0.20 \\ 0.35 \end{bmatrix}}_{m_{k-1}(.)} = \underbrace{\begin{bmatrix} 0 \\ 0.4050 \\ 0.1780 \\ 0.4170 \end{bmatrix}}_{m_k^-(.)}$$
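The two computations above are easy to reproduce. The following plain-Python check (helper names `mat_vec` and `transpose` are ours) applies the matrix of the example once as written and once transposed, with components ordered as (empty set, M1, M2, M1 ∪ M2).

```python
def mat_vec(matrix, vector):
    """Product of a matrix (list of rows) with a column vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*matrix)]


# Transition matrix and prior bba of the example.
M_t = [
    [1.0, 0.0, 0.00, 0.00],
    [0.0, 0.9, 0.00, 0.10],
    [0.0, 0.0, 0.89, 0.11],
    [0.0, 0.0, 0.00, 1.00],
]
m_prev = [0.0, 0.45, 0.20, 0.35]

wrong = mat_vec(M_t, m_prev)             # formula (11)
right = mat_vec(transpose(M_t), m_prev)  # formula (15)

print(sum(wrong))  # about 1.0065: not a valid bba
print(sum(right))  # about 1.0
```

Each column of the matrix of the example sums to one, which is why only the transposed product preserves the total mass.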


Remarks on BIMM filter: The BIMM is based on two¹⁰ pillars:
1) the conjunctive rule of combination, and 2) the pignistic
transformation to approximate a bba into a subjective probability
measure. These two pillars are disputable because:


¹⁰ Actually, Smets' Generalized Bayesian Theorem (GBT) could also
be considered as the third pillar of BIMM.
1. The efficiency of Smets' rule for combining bba's is very
questionable in this belief-based extension of IMM because it has
already been proved in [15], Vol. 3, and especially in the
sequential Target Type Tracking problem [7], that such a rule does
not perform well in general for mode change detection. Smets' rule
does not respond to new information since very quickly all the mass
of belief concentrates on the empty set. See the example in [15],
Vol. 3, Chap. 1, freely downloadable from the web and not reported
here due to space limitation.


2. The real interest and efficiency of the pignistic transformation
is also disputable because there exist other probabilistic
transformations which perform better than BetP in terms of
probabilistic information content, in particular the DSmP
transformation developed in [15], Vol. 3, Chap. 1 & 3, and also in
[8].


3. The justification for the use of Appriou's model no. 1 in Step 3
of BIMM is missing, and other (maybe better) models could probably
be developed to derive the updated bba $m_k(.)$. This question has
not been investigated in this paper and will be a source for future
research.


Interest of BIMM w.r.t. IMM: The potential advantage of the
belief-based IMM approach is to offer some robustness of the filter
by replacing the strong constraint on the knowledge of the
transition probabilities $\pi_{ij}$ (usually based on ad hoc
assumptions on the mean sojourn time of the target in each mode) by
a more flexible constraint on the transitions based on (very simple
and less specific) uncertain implication rules. With BIMM, one can
also relax the knowledge of the prior probabilities of the modes by
starting the tracking directly with a vacuous belief prior of the
modes. Of course, if one has good reasons to use a given prior of
modes, this can be done easily in the belief-based IMM approach,
which is also a nice feature of such a filter.


4 PCR-BIMM algorithm 


To preserve the potential advantages of BIMM and to overcome its
aforementioned problems, we propose to keep its general structure as
a belief-based extension of classical IMM, but to replace Smets'
rule by the more effective Proportional Conflict Redistribution rule
no. 5 (PCR5), or possibly the simpler PCR rule no. 6 (PCR6), and to
replace the pignistic transformation by the more effective DSmP
transformation to estimate the mode probabilities required in the
IMM filter. We call this new algorithm the PCR-BIMM filter. Before
giving the sketch of our PCR-BIMM filter, we briefly recall the PCR5
fusion rule and the DSmP transformation. All details and
justifications, with examples on PCR5 and DSmP, are freely available
on the web in [15], Vols. 2 & 3, and will not be reported here.
4.1 PCR5 and PCR6 fusion rules


In the DSmT (Dezert-Smarandache Theory) framework, the Proportional
Conflict Redistribution Rule no. 5 (PCR5) is generally used to
combine bba's. PCR5 transfers the conflicting mass only to the
elements involved in the conflict and proportionally to their
individual masses, so that the specificity of the information is
entirely preserved in this fusion process. Let $m_1(.)$ and $m_2(.)$
be two independent¹¹ bba's, then the PCR5 rule is defined as follows
(see [15], Vol. 2 for full justification and examples):
$m_{PCR5}(\emptyset) = 0$ and $\forall X \in 2^{\Theta} \setminus \{\emptyset\}$

$$m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^{\Theta} \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2) + \sum_{\substack{X_2 \in 2^{\Theta} \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2 m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2 m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \qquad (16)$$

where all denominators in (16) are different from zero. If a
denominator is zero, that fraction is discarded. Additional
properties of PCR5 can be found in [9]. Extensions of PCR5 for
combining qualitative bba's can be found in [15], Vol. 2 & 3. All
propositions/sets are in a canonical form. A variant of PCR5, called
PCR6, has been proposed by Martin and Osswald in [15], Vol. 2, for
combining s > 2 sources. The general formulas for the PCR5 and PCR6
rules are also given in [15], Vol. 2. PCR6 coincides with PCR5 when
one combines two sources. The difference between PCR5 and PCR6 lies
in the way the proportional conflict redistribution is done as soon
as three or more sources are involved in the fusion. For example,
consider three sources with bba's $m_1(.)$, $m_2(.)$ and $m_3(.)$,
$A \cap B = \emptyset$ for the model of the frame $\Theta$, and
$m_1(A) = 0.6$, $m_2(B) = 0.3$, $m_3(B) = 0.1$. With PCR5 the partial
conflicting mass $m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 0.018$
is redistributed back to A and B only, with respect to the following
proportions respectively: $x_A^{PCR5} = 0.01714$ and
$x_B^{PCR5} = 0.00086$, because the proportionalization requires

$$\frac{x_A^{PCR5}}{m_1(A)} = \frac{x_B^{PCR5}}{m_2(B) m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + m_2(B) m_3(B)}$$

that is

$$\frac{x_A^{PCR5}}{0.6} = \frac{x_B^{PCR5}}{0.03} = \frac{0.018}{0.6 + 0.03} \approx 0.02857$$

thus

$$x_A^{PCR5} = 0.60 \cdot 0.02857 \approx 0.01714$$
$$x_B^{PCR5} = 0.03 \cdot 0.02857 \approx 0.00086$$


With the PCR6 fusion rule, the partial conflicting mass
$m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 0.018$ is redistributed
back to A and B only, with respect to the following proportions
respectively: $x_A^{PCR6} = 0.0108$ and $x_B^{PCR6} = 0.0072$, because
the PCR6 proportionalization is done as follows:

$$\frac{x_A^{PCR6}}{m_1(A)} = \frac{x_{B,2}^{PCR6}}{m_2(B)} = \frac{x_{B,3}^{PCR6}}{m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + m_2(B) + m_3(B)}$$

that is

$$\frac{x_A^{PCR6}}{0.6} = \frac{x_{B,2}^{PCR6}}{0.3} = \frac{x_{B,3}^{PCR6}}{0.1} = \frac{0.018}{0.6 + 0.3 + 0.1} = 0.018$$

thus

$$x_A^{PCR6} = 0.6 \cdot 0.018 = 0.0108$$
$$x_{B,2}^{PCR6} = 0.3 \cdot 0.018 = 0.0054$$
$$x_{B,3}^{PCR6} = 0.1 \cdot 0.018 = 0.0018$$

and therefore with PCR6, one finally gets the following
redistributions to A and B:

$$x_A^{PCR6} = 0.0108$$
$$x_B^{PCR6} = x_{B,2}^{PCR6} + x_{B,3}^{PCR6} = 0.0054 + 0.0018 = 0.0072$$

From the implementation point of view, PCR6 is much simpler to
implement than PCR5. For convenience, Matlab codes of the PCR5 and
PCR6 fusion rules can be found in [15, 16].


¹¹ I.e. each source provides its bba independently of the other
sources.
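The Matlab codes referred to above are not reproduced here. As an independent illustration only, the sketch below implements the two-source rule (16) and the redistribution of a single partial conflict under PCR5 and PCR6. The frozenset encoding and the helper names `pcr5` and `split_conflict` are assumptions of this sketch; the input bba's are assumed to be normalized with no mass on the empty set.

```python
from itertools import product
from math import prod
from typing import Dict, FrozenSet, Sequence

Bba = Dict[FrozenSet[str], float]


def pcr5(m1: Bba, m2: Bba) -> Bba:
    """Two-source PCR5 rule (16); PCR6 gives the same result for two sources."""
    out: Bba = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        inter = x1 & x2
        if inter:
            out[inter] = out.get(inter, 0.0) + w1 * w2
        elif w1 + w2 > 0.0:  # a fraction with a zero denominator is discarded
            # the conflicting product w1*w2 goes back to x1 and x2,
            # proportionally to the masses w1 and w2 involved
            out[x1] = out.get(x1, 0.0) + w1 * w1 * w2 / (w1 + w2)
            out[x2] = out.get(x2, 0.0) + w2 * w2 * w1 / (w1 + w2)
    return out


def split_conflict(support: Dict[str, Sequence[float]], rule: str) -> Dict[str, float]:
    """Redistribute one partial conflict; `support` maps each conflicting set
    to the masses that the sources committed to it."""
    conflict = prod(w for ws in support.values() for w in ws)
    if rule == "PCR5":    # each set weighs in with the product of its masses
        weights = {s: prod(ws) for s, ws in support.items()}
    elif rule == "PCR6":  # each source weighs in with its own mass
        weights = {s: sum(ws) for s, ws in support.items()}
    else:
        raise ValueError("rule must be 'PCR5' or 'PCR6'")
    total = sum(weights.values())
    return {s: conflict * w / total for s, w in weights.items()}


A, B = frozenset({"A"}), frozenset({"B"})
fused = pcr5({A: 0.6, A | B: 0.4}, {B: 0.3, A | B: 0.7})
x_pcr5 = split_conflict({"A": [0.6], "B": [0.3, 0.1]}, "PCR5")
x_pcr6 = split_conflict({"A": [0.6], "B": [0.3, 0.1]}, "PCR6")
```

`x_pcr5` and `x_pcr6` reproduce the redistributions of the three-source example (about 0.01714 and 0.00086 for PCR5, 0.0108 and 0.0072 for PCR6), while `fused` shows that the two-source rule keeps the total mass equal to one.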


4.2 The DSmP transformation 


The DSmP probabilistic transformation is a serious alternative to
the classical pignistic transformation which allows to increase the
probabilistic information content (PIC), i.e. to minimize the
Shannon entropy, of the approximated subjective probability measure
drawn from any bba. Justification and comparisons of DSmP(.) w.r.t.
BetP(.) and to other transformations can be found in detail in
[8, 15], Vol. 3, Chap. 3. The DSmP transformation is defined¹² by
$DSmP_{\epsilon}(\emptyset) = 0$ and $\forall X \in 2^{\Theta} \setminus \{\emptyset\}$ by

$$DSmP_{\epsilon}(X) = \sum_{Y \in 2^{\Theta}} \frac{\sum_{Z \subseteq X \cap Y,\; C(Z) = 1} m(Z) + \epsilon \cdot C(X \cap Y)}{\sum_{Z \subseteq Y,\; C(Z) = 1} m(Z) + \epsilon \cdot C(Y)} \, m(Y) \qquad (17)$$

where $C(X \cap Y)$ and $C(Y)$ denote the cardinals of the sets
$X \cap Y$ and $Y$ respectively; $\epsilon \geq 0$ is a small number which
allows to reach the highest PIC value of the approximation of m(.)
into a subjective probability measure. Usually $\epsilon = 0$, but in
some particular degenerate cases, when the $DSmP_{\epsilon=0}(.)$ values
cannot be derived, the $DSmP_{\epsilon>0}$ values can always be derived
by choosing $\epsilon$ as a very small positive number, say
$\epsilon = 1/1000$ for example, in order to be as close as we want to
the highest value of the PIC. The smaller $\epsilon$, the better/bigger
the PIC value one gets. When $\epsilon = 1$ and when the masses of all
elements Z having C(Z) = 1 are zero, $DSmP_{\epsilon=1}(.) = BetP(.)$.
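For singletons X on the classical power set, formula (17) simplifies: $C(X \cap Y)$ is 1 when X belongs to Y and 0 otherwise. The sketch below computes DSmP for every singleton under that restriction. The function name `dsmp`, the frozenset encoding and the example masses are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable

Bba = Dict[FrozenSet[str], float]


def dsmp(m: Bba, frame: Iterable[str], eps: float = 0.001) -> Dict[str, float]:
    """DSmP of each singleton of `frame`, following (17) on the power set."""
    elements = list(frame)
    single = {x: m.get(frozenset({x}), 0.0) for x in elements}
    prob = {x: 0.0 for x in elements}
    for y, mass in m.items():
        if not y or mass == 0.0:
            continue
        # denominator of (17): singleton masses inside Y plus eps * C(Y)
        den = sum(single[z] for z in y) + eps * len(y)
        if den == 0.0:
            raise ValueError("degenerate case: choose eps > 0")
        for x in y:
            # numerator of (17) for a singleton X inside Y: m(X) + eps
            prob[x] += (single[x] + eps) / den * mass
    return prob


AB = frozenset({"A", "B"})
m = {frozenset({"A"}): 0.3, frozenset({"B"}): 0.1, AB: 0.6}
p_eps0 = dsmp(m, "AB", eps=0.0)          # mass of A u B split 3:1
p_eps1 = dsmp({AB: 1.0}, "AB", eps=1.0)  # vacuous bba: same result as BetP
```

With $\epsilon = 0$ the mass 0.6 on A ∪ B is shared in proportion to the singleton masses 0.3 and 0.1, giving probabilities 0.75 and 0.25, whereas BetP would give 0.6 and 0.4.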

¹² Here we work on the classical power set, but DSmP can also be
defined for other fusion spaces, hyper-power sets or super-power
sets, if necessary.
4.3 Sketch of PCR-BIMM


We briefly summarize the five steps of our PCR-BIMM filter.

• Step 0 (Initialization at k = 0): Same as Step 0 of BIMM.

• Step 1 (Interaction-mixing): Same as Step 1 of BIMM except that
the predicted bba $m_k^-(.)$ is computed by (15) instead of (11),
that is

$$m_k^-(.) \triangleq \mathbf{M}_t^{\top} \cdot m_{k-1}(.) \qquad (18)$$

and the derivation of the mixing probability
$\mu_{i|j}(k-1|k-1) = P\{M_i(k-1)|M_j(k), Z^{k-1}\}$ of classical IMM is
replaced by the DSmP probability drawn from
$m_{k-1|k-1}(.|M_j(k))$, that is:

$$\mu_{i|j}(k-1|k-1) = DSmP_{\epsilon}(M_i(k-1)|M_j(k), Z^{k-1})$$

where $DSmP_{\epsilon}(.)$ is calculated with the transformation (17)
using $m_{k-1|k-1}(.|M_j(k))$ given by (12).

• Step 2: Same as IMM Step 2.

• Step 3 (Mode bba update): The updated bba $m_k(.)$ of modes is
computed from the PCR5 (or possibly PCR6) rule, denoted $\oplus$,
applied to the predicted bba $m_k^-(.)$ given in (15) and the bba's
$m_{k,j}(.)$, j = 1, 2, ..., r by

$$m_k(.) = [m_{k,1} \oplus \ldots \oplus m_{k,r} \oplus m_k^-](.) \qquad (19)$$

where the observed bba's $m_{k,j}(.)$ for j = 1, ..., r are given as
in BIMM by (14).

• Step 4 (Global estimation for output purpose): The global
estimate $\hat{\mathbf{x}}(k|k)$ and the covariance of estimation error
$\mathbf{P}(k|k)$ are given as in step 4 of classical IMM by taking
$\mu_j(k) = DSmP_{\epsilon}\{M_j(k)|Z^k\}$ computed from the updated bba
$m_k(.)$ by (17).
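The mode-related part of these steps (prediction (18), observed bba's (14), PCR fusion (19) and DSmP (17)) can be sketched for r = 2 modes as follows. This is only a hedged illustration: the Kalman filtering of Step 2 and the GBT-based mixing of Step 1 are left out, the observed bba's are fused sequentially with the two-source PCR5 rule (an assumption of this sketch, since PCR5 is not associative and (19) combines all sources at once), and the likelihoods and the values of `alpha`, `R` and `eps` are made up.

```python
from itertools import product

M1, M2 = frozenset({"M1"}), frozenset({"M2"})
THETA = M1 | M2


def predict(m_prev, betas):
    """Predicted bba (18) for two modes: mode i keeps beta_i of its mass,
    the remaining 1 - beta_i moves to THETA."""
    b1, b2 = betas
    p1, p2, pt = m_prev.get(M1, 0.0), m_prev.get(M2, 0.0), m_prev.get(THETA, 0.0)
    return {M1: b1 * p1, M2: b2 * p2, THETA: (1 - b1) * p1 + (1 - b2) * p2 + pt}


def observed(mode, likelihood, alpha, R):
    """Observed bba (14): mass against the mode, the rest on THETA."""
    against = alpha * (1.0 - R * likelihood)
    return {THETA - mode: against, THETA: 1.0 - against}


def pcr5(m1, m2):
    """Two-source PCR5 rule (16)."""
    out = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        inter = x1 & x2
        if inter:
            out[inter] = out.get(inter, 0.0) + w1 * w2
        elif w1 + w2 > 0.0:
            out[x1] = out.get(x1, 0.0) + w1 * w1 * w2 / (w1 + w2)
            out[x2] = out.get(x2, 0.0) + w2 * w2 * w1 / (w1 + w2)
    return out


def dsmp(m, eps):
    """DSmP (17) of the two modes; eps > 0 keeps every denominator positive."""
    single = {x: m.get(frozenset({x}), 0.0) for x in THETA}
    prob = dict.fromkeys(THETA, 0.0)
    for y, mass in m.items():
        den = sum(single[z] for z in y) + eps * len(y)
        for x in y:
            prob[x] += (single[x] + eps) / den * mass
    return prob


def mode_update(m_prev, betas, likelihoods, alpha=0.9, R=1.0, eps=0.001):
    """One cycle: prediction, sequential PCR5 fusion, DSmP mode probabilities."""
    m = predict(m_prev, betas)
    for mode, lik in zip((M1, M2), likelihoods):
        m = pcr5(observed(mode, lik, alpha, R), m)
    return m, dsmp(m, eps)


# Start from the vacuous prior, with the beta values of the example of [13].
m_k, mu_k = mode_update({THETA: 1.0}, betas=(0.9, 0.89), likelihoods=(0.6, 0.2))
```

Unlike the conjunctive rule, no mass ends up on the empty set here, so the bba stays normalized without any extra normalization step.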


Remark: This preliminary version of PCR-BIMM can still be improved
because it shares several points with BIMM¹³. In particular, Step 3
of PCR-BIMM calculates, as in BIMM, $m_{k,j}(.)$ with a model based
on the likelihoods $\Lambda_j(k)$ whose strong justification is
missing. Further investigations will be done in future research to
improve this Step 3, as well as Step 1, to get better performances
of PCR-BIMM (if possible).


5 Simulation results 


In this section, we present the application of the PCR-BIMM to a
ground target tracking problem. We consider a vehicle located at
(1000 m, 5000 m) in the Cartesian reference frame (X, Y). We simulate
a ground sensor located at (0, 0) which is able to detect the moving
target in range $\rho$ and azimuth $\theta$. The Gaussian measurement
noise is assumed to be white and centered, with standard deviations
$\sigma_{\rho} = 20$ m and $\sigma_{\theta} = 0.008$ rad. The sampling time is
fixed to 2 seconds. For tracking the ground target we only consider
two motion models.


¹³ In particular, the GBT is still used in Step 2 of PCR-BIMM.

Figure 1: True target trajectory and estimated trajectories.


The first is a constant velocity motion model called CV1, with a
small noise $\sigma_{CV_1} = 1\ \mathrm{m.s^{-2}}$, and the second is another
constant velocity motion model called CV2, with a bigger noise
$\sigma_{CV_2} = 4\ \mathrm{m.s^{-2}}$ to cope with the target maneuvers. The
initial state for each of IMM, BIMM¹⁴ and PCR-BIMM is the true
initial target state x(0). The transition matrix $\mathbf{P}_t$ is equal
to:

$$\mathbf{P}_t = \begin{bmatrix} 0.95 & 0.05 \\ 0.05 & 0.95 \end{bmatrix} \qquad (20)$$

and the mass transition matrix $\mathbf{M}_t$ for the BIMM and PCR-BIMM is
the same as in the paper [13]. The initial motion model mass is
represented by the vacuous mass function.
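The measurement setup described in this section (a sensor at the origin observing range and azimuth with white, centered Gaussian noise) can be simulated along the following lines. This is a sketch under stated assumptions: the azimuth is measured from the X axis with `atan2`, the noise levels 20 m and 0.008 rad are treated as standard deviations, and the function names are ours.

```python
import math
import random


def measure(x, y, sigma_rho=20.0, sigma_theta=0.008, rng=random):
    """Noisy range/azimuth measurement of a target at (x, y), sensor at (0, 0)."""
    rho = math.hypot(x, y) + rng.gauss(0.0, sigma_rho)
    theta = math.atan2(y, x) + rng.gauss(0.0, sigma_theta)
    return rho, theta


def to_cartesian(rho, theta):
    """Convert a (range, azimuth) pair back to Cartesian coordinates."""
    return rho * math.cos(theta), rho * math.sin(theta)


rng = random.Random(0)  # fixed seed so that a run can be repeated
z = measure(1000.0, 5000.0, rng=rng)  # one noisy measurement of the initial position
exact = to_cartesian(*measure(1000.0, 5000.0, sigma_rho=0.0, sigma_theta=0.0))
```

With both noise levels set to zero, the round trip returns the initial position (1000 m, 5000 m), which is a quick sanity check of the conversion.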

To compare the performances of the algorithms we used the root
mean square error (RMSE) in location and velocity (figure 2) and the
mean of the motion model probabilities obtained with 100 Monte-Carlo
runs (figures 3, 4, 5). The first remark is that there is no
significant improvement from using belief functions in the IMM. In
fact, the RMSE of the IMM, BIMM and PCR-BIMM are globally the same.
However, we can observe a slight difference in the PCR-BIMM error
after the target maneuvers, in the time intervals [20, 30] and
[40, 50]. This observation leads to the second remark: the motion
model transition duration is longer with the IMM (figure 3) and BIMM
(figure 4) than with the PCR-BIMM (figure 5). Thus, with the
parameters taken for this simulation, the PCR-BIMM appears to be a
good and fast detector of motion model transitions. However, its
computed motion model probability is lower than the probability
obtained with the IMM and BIMM. More investigations need to be done
to see if it is possible (and how) to improve PCR-BIMM in order to
preserve the good performance of the maneuver detection and at the
same time get a higher probability when the target keeps moving in
the same mode.


¹⁴ Our BIMM implementation uses the algorithm described in section 3
with (15) and an additional normalization step of $m_k(.)$ in (7),
since otherwise the BIMM algorithm doesn't work at all due to the
problem mentioned in section 2.

Figure 2: Root Mean Square Error.
6 Conclusions


In this paper, we have examined in detail the recent BIMM
algorithm, corrected a mistake in it, and identified some of its
limitations. To overcome the problems of the BIMM algorithm, we have
developed a more efficient belief-based algorithm, called PCR-BIMM,
based on the Proportional Conflict Redistribution fusion rule and on
the DSmP probabilistic transformation, which replace the conjunctive
rule and the pignistic transformation used in BIMM. The derivation
of the predicted bba of modes, done incorrectly in BIMM, is also
fixed in our PCR-BIMM filter. The performances of PCR-BIMM with
respect to the (corrected) BIMM and to the classical IMM have been
evaluated on a simple maneuvering target tracking scenario through
Monte-Carlo simulations. The results obtained in this paper show the
ability of the PCR-BIMM to track maneuvering targets and also to
improve the maneuver detection. It is important to note that such a
PCR-BIMM filter can be considered as more robust than IMM since
PCR-BIMM requires less specific prior information than IMM.
Nevertheless, PCR-BIMM provides globally the same RMS estimation
error performances as those obtained with the classical IMM, which
requires more specific prior information. Application of PCR-BIMM
for tracking multiple maneuvering ground targets in a battlefield
surveillance context is under investigation and results will be
published in forthcoming papers.
Implementation of Approximations of Belief Functions
for Fusion of ESM Reports within the DSm Framework


Pascal Djiknavorian 
Pierre Valin 
Dominic Grenier 


Abstract - Electronic Support Measures consist of passive 
receivers which can identify emitters which, in turn, can 
be related to platforms that belong to 3 classes: Friend, 
Neutral, or Hostile. Decision makers prefer results 
presented in STANAG 1241 allegiance form, which adds 
2 new classes: Assumed Friend, and Suspect. Dezert- 
Smarandache (DSm) theory is particularly suited to this 
problem, since it allows for intersections between the 
original 3 classes. However, as we know, the DSm hybrid 
combination rule is highly complex to execute and 
requires high amounts of resources. We have applied and 
studied a Matlab implementation of Tessem's k-l-x, 
Lowrance’s Summarization and Simard’s approximation 
techniques in the DSm theory for the fusion of ESM 
reports. Results are presented showing that we can 
improve on the time of execution while maintaining or 
getting better rates of good decisions in some cases. 


Keywords: Dezert-Smarandache Theory, ESM, approximations, Belief
functions.


1 Introduction 


In terms of classification, the Dezert-Smarandache theory (DSmT)
can be quite useful, especially for directly resolving
classification problems with hierarchical class structures. One
example is the allegiance classification structure suggested by
STANAG 1241, where a structure of five classes (3 main classes and
2 derived classes) is required. DSmT is able to output to any of
those classes without modifications to its fusion process.


However, this example is still a simple one, and DSmT, with or
without approximation, can solve it quite easily, which would not be
the case for classification problems of higher dimension. By
dimension we mean the cardinality of the frame of discernment. In
fact, DSmT can become highly complex and computationally
prohibitive as soon as we reach a dimension of 6, that is, a
classification problem having six main classes and, in the worst
case, a total of 7,828,353 possible derived classes.

Various avenues of research have been tried to avoid or address
this complexity problem [10, 13, 18]. However, even just counting
the number of possible classes is still an active problem in
mathematics known as the Dedekind problem, or the problem of
counting antichains [9, 18].


Originally published as Djiknavorian P., Valin P., Grenier D.,
Implementation of Approximations of Belief Functions for Fusion of
ESM Reports within the DSm Framework, in Proc. of Fusion 2010,
Edinburgh, Scotland, UK, 26-29 July 2010, and reprinted with
permission.


In this paper, we study the use of approximation techniques to
restrain the staggering amount of data that DSmT can generate in its
fusion process. More specifically, we have chosen Tessem's k-l-x
approximation technique [4], Lowrance's summarization [19] and the
technique of Simard et al. [3, 7, 8], and used them within DSmT with
the DSm hybrid combination rule (DSmH). We have also experimented
with the fusion process while using an approximation technique and
compared it to the case without approximation, to analyze how it
affects the quality of the decision process. More specifically, we
compare the good decision rate in the two cases, with and without
the use of approximation.


1.1 Realistic Case Study 


Electronic Support Measures (ESM) consist of passive 
receivers which can identify emitters coming from a small 
bearing angle, which, in turn, can be related to platforms 
that belong to 3 classes: either Friend (F), Neutral (N), or 
Hostile (H). Decision makers prefer results presented in 
STANAG 1241 allegiance form, which adds 2 classes: 
Assumed Friend (AF), and Suspect (S). 


The DSm theory is particularly suited to this problem, since 
it allows for intersections between the original 3 classes of 
allegiance. In this way an intersection of Friend and Neutral 
can lead to an Assumed Friend, and an intersection of 
Hostile and Neutral can lead to a Suspect. This structure of 
allegiances will be referred to as STANAG allegiance [11]. 


Figure 1 displays a visual representation of a possible
interpretation of STANAG allegiance in DSmT. We can see that even
though the input consists only of three classes, we are able to give
an output into five classes. For example, here we have the class
‘Suspect’, which could be the result obtained after fusing ‘Hostile’
with ‘Neutral’. We also have the class ‘Assumed Friend’, which could
be the result obtained after fusing ‘Friend’ with ‘Neutral’. Note
that this case example has the intersection $F \cap H = \emptyset$, the
null set, which is a constraint in DSm, leading to the use of its
hybrid rule. This case example would be relevant for peacekeeping
missions where Hostile and Friendly forces aren’t likely to be close
to one another. We will be working on that case, with
$F \cap H = \emptyset$.





Figure 1. Venn diagram for the STANAG allegiances. 


2  Dezert-Smarandache Theory 


The DSm theory uses the language of masses assigned to each
declaration from a sensor (in our case, the ESM sensor). In DSm
theory, all unions and intersections are allowed for a declaration.
For our case of cardinality 3, $\Theta = \{\theta_1, \theta_2, \theta_3\}$, with
$|\Theta| = 3$, $D^{\Theta}$ is still of manageable size, namely it has a
cardinality of 19 [10]. In DSm theory, a constraint like the one
imposed by Figure 1, namely that
$F \cap H = \theta_1 \cap \theta_3 = \emptyset$, is treated by the DSm hybrid
combination rule (DSmH) below:

$$m(A) = \phi(A) [ S_1(A) + S_2(A) + S_3(A) ] \qquad (1)$$

The reader is referred to a series of books [10, 13, 17] on DSm
theory for lengthy descriptions of the meaning of this formula
(note that the function $\phi$ is not to be confused with the empty
set). A three-step approach was proposed in [12], which is used
here. The incoming sensor reports are either Friend ($F = \theta_1$),
Neutral ($N = \theta_2$) or Hostile ($H = \theta_3$). Figure 1 has the
interpretation of the five classes:

$$\text{Friend} = \{\theta_1 - \theta_1 \cap \theta_2\} \qquad (2)$$
$$\text{Hostile} = \{\theta_3 - \theta_3 \cap \theta_2\} \qquad (3)$$
$$\text{Assumed Friend} = \{\theta_1 \cap \theta_2\} \qquad (4)$$
$$\text{Suspect} = \{\theta_2 \cap \theta_3\} \qquad (5)$$
$$\text{Neutral} = \{\theta_2 - \theta_1 \cap \theta_2 - \theta_3 \cap \theta_2\} \qquad (6)$$


As in [15], we call STANAG-probability the pignistic probability
assigned to the five classes shown by equations (2) to (6). We use
the generalized pignistic transformation, as shown by [10] or
equation (7), to obtain the probability values of the sets used in
those equations.

$$P\{A\} = \sum_{X \in D^{\Theta}} \frac{C_M(X \cap A)}{C_M(X)} \, m(X) \qquad (7)$$

where $C_M(A)$ is the DSm cardinal of a set A, i.e. the total number
of parts of the Venn diagram that make up A. Each of these parts
has a numeric weight equal to one, so that all parts count equally.
The DSm cardinal is used in the generalized pignistic
transformation to redistribute the mass of a set A among all its
parts B such that B is included in or equal to A.
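One possible way to experiment with equation (7) is to encode every element of $D^{\Theta}$ as the set of Venn-diagram parts it covers, so that the DSm cardinal becomes a set size. The sketch below does this for the constrained model of Figure 1; the part labels, the helper name `gpt` and the example masses are hypothetical choices made for illustration.

```python
# The five disjoint Venn parts of the model with F and H disjoint
# (labels are hypothetical).
THETA1 = frozenset({"F-only", "F&N"})          # Friend declaration
THETA2 = frozenset({"F&N", "N-only", "N&H"})   # Neutral declaration
THETA3 = frozenset({"N&H", "H-only"})          # Hostile declaration
IGNORANCE = THETA1 | THETA2 | THETA3

# STANAG classes, following equations (2) to (6).
STANAG = {
    "Friend": THETA1 - THETA2,
    "Hostile": THETA3 - THETA2,
    "Assumed Friend": THETA1 & THETA2,
    "Suspect": THETA2 & THETA3,
    "Neutral": THETA2 - THETA1 - THETA3,
}


def gpt(a, m):
    """Generalized pignistic probability (7) of the set `a`; the DSm cardinal
    of a set is the number of Venn parts it contains."""
    return sum(len(x & a) / len(x) * mass for x, mass in m.items() if x)


# Made-up fused masses on three elements of the hyper-power set.
m = {THETA1: 0.5, THETA1 & THETA2: 0.2, IGNORANCE: 0.3}
stanag_prob = {name: gpt(cls, m) for name, cls in STANAG.items()}
```

Because the five STANAG classes are exactly the five disjoint parts, the resulting STANAG-probabilities add up to one.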


3 Approximation technique 


3.1 k-l-x approximation

The k-l-x approximation technique developed by Tessem [4] is
designed to approximate a Basic Probability Assignment (BPA), or
mass function, in Dempster-Shafer Theory (DST). Since DSm theory
works directly with BPAs, applying the k-l-x approximation technique
to the DSmH is quite straightforward and can be done without any
changes.


This algorithm for the approximation of BPAs involves three
parameters: k, the minimum number of focal elements to be kept; l,
the maximum number of focal elements to be kept; and x, the maximum
threshold on the sum of the lost masses. It can be summarized as
follows:


1. Select the k focal elements with the highest masses;


2. While the sum of their masses is less than 1-x, and while their
number is less than l, add the next focal element with the highest
mass.
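The two steps above translate almost directly into code. In the sketch below, the final renormalization of the kept masses is an assumption on our part (the summary above does not say how the lost mass is handled), and the function name `klx` and the example BPA are illustrative.

```python
def klx(m, k, l, x):
    """k-l-x approximation of a BPA given as {focal element: mass}."""
    ranked = sorted(m.items(), key=lambda kv: kv[1], reverse=True)
    kept = ranked[:k]                      # step 1: the k highest masses
    kept_mass = sum(w for _, w in kept)
    for item in ranked[k:]:                # step 2: add while both tests hold
        if kept_mass >= 1.0 - x or len(kept) >= l:
            break
        kept.append(item)
        kept_mass += item[1]
    # Assumption: the kept masses are renormalized so that they sum to one.
    return {s: w / kept_mass for s, w in kept}


bpa = {"a": 0.40, "b": 0.25, "c": 0.15, "d": 0.10, "e": 0.06, "f": 0.04}
approx_l = klx(bpa, k=2, l=4, x=0.1)  # stops because l = 4 elements are kept
approx_x = klx(bpa, k=2, l=4, x=0.3)  # stops because the lost mass is below x
```

In the first call the limit l ends the loop with four focal elements; in the second, three focal elements already hold more than 1 - x of the mass.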


3.2 Simard et al. approximation


This truncation scheme [3, 7, 8] has had many minor variations over
time. Similarly to the k-l-x approximation, it was conceived to
approximate a BPA, or mass function, in DST and, as with k-l-x, we
were able to transfer it to the DSm framework. Variants exist, but
all focus on preferentially keeping the fused propositions with the
smallest lengths (lowest cardinality) after passing 2 thresholding
steps. The rule therefore involves 3 parameters: BPAmax, BPAmin and
Nmax. It retains fused propositions according to the following
rules:


1. All fused propositions with BPA > BPAmax are kept
(thresholding step 1).

2. All fused propositions with BPA < BPAmin are discarded
(thresholding step 2).

3. If the number of propositions retained in step 1 is smaller than
Nmax, retain, by decreasing BPA, propositions of length 1; then, if
the number of retained propositions is still smaller than Nmax,
retain, by decreasing BPA, propositions of length 2; and so on for
length 3...

4. If the number of retained propositions is still smaller than
Nmax, retain propositions by decreasing BPA regardless of length.
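The four rules can be sketched as below. Several details are assumptions of this sketch rather than facts from [3, 7, 8]: the preferential pass stops at a hypothetical `max_len` of 3, the retained masses are renormalized at the end, and propositions are encoded as frozensets whose size plays the role of the length.

```python
def simard_truncate(m, bpa_max, bpa_min, n_max, max_len=3):
    """Truncation of a BPA {proposition (frozenset): mass} following rules 1 to 4."""
    kept = {s: w for s, w in m.items() if w > bpa_max}        # rule 1
    pool = sorted(
        ((s, w) for s, w in m.items() if bpa_min <= w <= bpa_max),  # rule 2
        key=lambda kv: kv[1],
        reverse=True,
    )
    # rule 3: lengths 1, 2, ... max_len; rule 4: any length (None)
    for length in list(range(1, max_len + 1)) + [None]:
        for s, w in pool:
            if len(kept) >= n_max:
                break
            if s not in kept and (length is None or len(s) == length):
                kept[s] = w
    total = sum(kept.values())
    if total == 0.0:
        return {}
    # Assumption: renormalize the retained masses so that they sum to one.
    return {s: w / total for s, w in kept.items()}


F, N, H = frozenset("F"), frozenset("N"), frozenset("H")
bpa = {F: 0.50, F | N: 0.20, N: 0.12, F | N | H: 0.10, H: 0.05, N | H: 0.03}
truncated = simard_truncate(bpa, bpa_max=0.3, bpa_min=0.04, n_max=3)
```

In this made-up example the proposition F ∪ N is dropped although it has more mass than N or H, which illustrates the preference for short propositions.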


3.3 Lowrance’s approximation 


Similar to the k-l-x procedure, the summarization method [19]
(inspired by the summarization operation described in Bauer’s
research [5]) leaves the best-valued focal elements of the mass
function under consideration unchanged. The numerical values of the
remaining focal elements are accumulated and assigned to the
set-theoretic union of the corresponding subsets of $\Theta$. Here
again, the technique was conceived to approximate a BPA, or mass
function, in DST, and we were able to transfer it to the DSm
framework.
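A minimal sketch of the summarization step follows. The parameter `n_keep` (how many best-valued focal elements stay unchanged) and the frozenset encoding are assumptions made for illustration; in the DSm framework the same code would apply to sets of Venn parts.

```python
def summarize(m, n_keep):
    """Keep the n_keep focal elements with the highest masses and move the
    accumulated mass of all the others to the union of those others."""
    ranked = sorted(m.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[:n_keep])
    rest = ranked[n_keep:]
    if rest:
        union = frozenset().union(*(s for s, _ in rest))
        kept[union] = kept.get(union, 0.0) + sum(w for _, w in rest)
    return kept


F, N, H = frozenset("F"), frozenset("N"), frozenset("H")
bpa = {F: 0.50, F | N: 0.20, N: 0.12, H: 0.10, N | H: 0.08}
summary = summarize(bpa, n_keep=2)
```

Here the masses of N, H and N ∪ H (0.30 in total) are accumulated on N ∪ H, so the total mass is preserved without any renormalization.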


3.4 Implementation of approximations


The information coming from the sensor is a simple belief function
giving a mass to an allegiance and the remaining mass to the
ignorance. The combination itself combines two belief functions:
one is the information from the sensor at time t, the other
contains the past information, held in the combination result from
time t-1. The fusion process is realized dynamically. Since the
information to combine from the sensor is a simple belief function,
the approximation is applied on the result of the combination.


4 A typical simulation scenario 


The prerequisites that a typical scenario must address are:
(1) to be able to adequately represent the known ground truth,
(2) to contain sufficient countermeasures (or misassociations) to be
realistic and to test the robustness of the theories, (3) to only
provide partial knowledge about the ESM sensor declaration, which
therefore contains uncertainty, (4) to be able to show stability
under countermeasures, yet (5) to be able to switch allegiance when
the ground truth does so.


The following scenario parameters have therefore been chosen
accordingly: (1) the ground truth is FRIEND for the first 50
iterations of the scenario and HOSTILE for the last 50, (2) the
number of correct associations is 80%, corresponding to
countermeasures appearing 20% of the time, in a randomly selected
sequence, (3) the ESM declaration has a mass (confidence value in
Bayesian terms) of 0.8, with the rest of the mass being assigned to
the ignorance (the full set of elements, namely $\Theta$).
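A scenario with these parameters can be generated along the following lines. The function name `make_scenario`, the use of zero-based time indices and the uniform choice of the wrong allegiance among the other two (consistent with the roughly equal split of the misassociations in this scenario) are assumptions of this sketch.

```python
import random

ALLEGIANCES = ("F", "N", "H")


def make_scenario(seed, n_steps=100, switch=50, p_correct=0.8, mass=0.8):
    """Return a list of (ground truth, declaration, sensor BPA) triples."""
    rng = random.Random(seed)
    reports = []
    for t in range(n_steps):
        truth = "F" if t < switch else "H"
        if rng.random() < p_correct:
            declared = truth
        else:
            # countermeasure: one of the other two allegiances, picked uniformly
            declared = rng.choice([a for a in ALLEGIANCES if a != truth])
        # simple belief function: mass on the declaration, the rest on Theta
        reports.append((truth, declared, {declared: mass, "Theta": 1.0 - mass}))
    return reports


scenario = make_scenario(seed=1)
```

Each Monte-Carlo run would call `make_scenario` with a different seed, and reusing a seed reproduces the same sequence of declarations.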


This scenario will be the one addressed in the next section, while
a Monte-Carlo study is described in the subsequent sections. Each
Monte-Carlo run corresponds to a different realization using the
above scenario parameters, but with a different random seed. The
chosen scenario is depicted in Figure 2.
Figure 2. Chosen scenario.


Roughly 80% of the time the ESM declares the correct allegiance
according to ground truth, and the remaining 20% is roughly equally
split between the other two allegiances. Note that these
percentages of occurrences hold from a statistical point of view
only, so that in the long run a large number of randomly generated
scenarios would amount to these ratios. There is an allegiance
switch at the 50th time index, and the randomly selected seed in the
above generated scenario produces a rather unusual sequence of 4
false Friend declarations starting at time index 82 (when Hostile is
actually the ground truth).


4.1 Results for the simulated scenario


Before presenting the results, it should be noted that the original
form of the DSmH tends to accumulate masses on intersections, as is
the case for any rule based on conjunction [14]. An ad hoc solution
exists [3, 7, 8]; it consists in renormalizing after each fusion
step by giving the complete ignorance a value which can never be
below a certain factor (chosen here to be 0.04, as research in [14]
shows that this value is appropriate for this case, being high
enough to avoid the accumulation but still low enough not to
interfere with the combination’s performance). That solution was
originally developed for the well-known problem of the DST
combination, which tends to be overly optimistic, which in turn
prevents it from reacting quickly to changes of allegiance. For
more on the behavior of the DSmH in similar cases the reader is
referred to [14, 15, 16], as we focus here on exploring the effect
of approximations on DSm.
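The ad hoc renormalization with a floor of 0.04 on the ignorance can be sketched as follows. How the other masses are rescaled is not stated here, so the proportional rescaling used below, like the function name `floor_ignorance` and the example masses, is an assumption of this sketch.

```python
def floor_ignorance(m, theta, floor=0.04):
    """Keep the mass of the complete ignorance `theta` at or above `floor`,
    rescaling the other masses proportionally (assumed scheme)."""
    if m.get(theta, 0.0) >= floor:
        return dict(m)
    others = sum(w for s, w in m.items() if s != theta)
    scale = (1.0 - floor) / others
    out = {s: w * scale for s, w in m.items() if s != theta}
    out[theta] = floor
    return out


fused = {"F": 0.70, "F&N": 0.29, "Theta": 0.01}
renormalized = floor_ignorance(fused, "Theta")
untouched = floor_ignorance({"F": 0.9, "Theta": 0.1}, "Theta")
```

In a dynamic fusion loop this function would be applied to the combination result at each time step, before the next sensor report is fused.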


Since the whole idea behind using DSm was to present the results to
the decision maker in the STANAG allegiance format, the result of
Figure 3 would be used. For the DSmH [10], it was suggested to use
the Generalized Pignistic Probability, which is based on the
pignistic transformation [6, 10], in order to make a decision on a
singleton belonging to the input ESM-allegiance.
Figure 3. DSmH result for the chosen scenario.


The decision maker would clearly be informed that misassociations
have occurred, since Assumed Friend dominates for the first 50 time
indices and Suspect for the last 50. The Friend declarations
starting at time index 82 cause confusion, as they should. The
change in allegiance at time index 50 is detected quickly. What is
even more important is that F and AF are clearly preferred for the
first 50 time indices and S and H for the last 50, as they should
be.
Figure 4. Approximated DSmH result for the same scenario
with k-l-x = (5, 6, 0.2)


We can gather from Figure 4 and Figure 5 that the DSmH and the
approximated DSmH have very similar behaviors. In fact, one has to
look at the figures very closely to perceive the differences. We can
see that in the first half of the approximated version, the assumed
friend allegiance is slightly favored over the friend allegiance.
Near the end of the scenario the hostile allegiance is favored over
the suspect allegiance. However, in both cases, even though the
small change could possibly affect our decision, the
STANAG-probability still seems to stay within the same type of
allegiance, in the sense that a friend and a target of assumed
friend allegiance would both inspire a friendly response on our
part. The same can be said for a target of suspect or hostile
allegiance, which would both inspire a hostile or defensive response
on our part. In short, we can easily proceed with the approximation
and still be able to make the same decision in the same way.

Figure 5. Approximated DSmH result for the same scenario
with k-l-x = (3, 6, 0.2)


4.2 Effects of varying the k-l-x parameters 


We ran the scenario for various values of k-l-x, with 
k ∈ [3, 10], l ∈ [6, 12] and x ∈ [0.2, 0.4]. For the cases where 
we had k=8, no changes in l and x had any impact, and 
compared to the DSmH, we only noticed a very small 
variation at the start and end of the simulated scenario. For 
the cases where we had k=6, no changes in l and x had any 
impact, and compared to the DSmH there was only very little 
variation in value throughout the scenario. The same is true 
for the cases with k=5, with Figure 4 showing the 
results for that case. The amplitude of the variation between 
the DSmH and the approximated version continues to increase 
as the k value diminishes. 


We finally begin to notice small changes with x=0.2 as 
opposed to 0.3 or 0.4 when we reach k=4. However, the 
impact of having x at 0.2 is small and confined to the start 
of the scenario, where it gives more weight to the suspect 
class at the expense of the hostile class. For the cases with 
k=3, the impact of setting x to 0.2 was more 
significant and lasted throughout most of the scenario’s 
duration. Also, while for k ∈ [4, 8] the curves 
all behaved very similarly to one another, when we 
reach k=3 we observe a partial loss of smoothness, hence a 
more reactive behavior toward countermeasures and 
allegiance change. Figure 5 shows the case of the simulated 
scenario for an approximated DSmH with klx = (3, 6, 0.2). 
Note that in all our experiments for our chosen 
scenario the l parameter never had any visible impact. 
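
As a reminder of what the three parameters control, the k-l-x approximation is commonly described (following Tessem [4]) as keeping at least k and at most l of the highest-mass focal elements while discarding at most a total mass x, then renormalizing what is kept. The Python sketch below illustrates that description only; the function name `klx_approximation`, the dictionary representation of a bba and the example masses are assumptions made for illustration and are not the implementation used for the simulations reported here.

```python
def klx_approximation(bba, k, l, x):
    """Sketch of a k-l-x approximation of a bba (dict: focal element -> mass).

    Assumed rule: keep focal elements in decreasing order of mass until
    either l elements are kept, or at least k elements are kept and the
    discarded mass would be at most x. The kept masses are renormalized.
    """
    if not 1 <= k <= l:
        raise ValueError("expected 1 <= k <= l")
    ranked = sorted(bba.items(), key=lambda item: item[1], reverse=True)
    kept, kept_mass = [], 0.0
    for focal, mass in ranked:
        if len(kept) >= l:
            break  # never keep more than l focal elements
        if len(kept) >= k and kept_mass >= 1.0 - x:
            break  # at least k kept and at most x of the mass discarded
        kept.append((focal, mass))
        kept_mass += mass
    return {focal: mass / kept_mass for focal, mass in kept}


# Illustrative (made-up) masses on four allegiances.
example = {"F": 0.5, "AF": 0.3, "S": 0.15, "H": 0.05}
print(klx_approximation(example, k=1, l=3, x=0.25))
```

With these made-up masses, a larger k forces more focal elements to be kept even when the mass criterion is already satisfied, which is consistent with the observation above that the approximation only starts to change the results once k becomes small.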


5 Monte-Carlo simulations with k-l-x 
approximation 


Although a special case such as the one described in the 
previous section offers valuable insight, one might question 
if the conclusions from that one scenario pass the test of 
multiple Monte-Carlo scenarios. This question is answered 
in this section. 


In order to expand the parameter space, we ran 
the simulations of the current section with the ESM 
certainty at 80% and 90%, an ESM confidence at 80% and an 
ignorance threshold at 0.04 as before. The number of 
Monte-Carlo runs was set to 100. The randomly generated 
streams of ESM reports used for the DSmH and the 
approximated DSmH are the same, so that we can freely 
compare the effects of the use of the approximation and the 
impact of the variation of its parameters. 


To display the results of our simulations graphically, we 
chose the rate of good decisions. As mentioned earlier, we 
have a good decision when we conclude to be friendly toward 
a friendly-behaving target and the ground truth is of class 
friend. A friendly-behaving target is a target that is 
concluded to be a friend or an assumed friend. We also have 
a good decision when we conclude to be hostile toward a 
hostile-behaving target and the ground truth is of class 
hostile. A hostile-behaving target is a target that is 
concluded to be a hostile or a suspect. A decision is made 
by taking the set of maximum STANAG-probability. 
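
The good-decision criterion described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text: the allegiance labels, the function names and the two sample probability vectors are hypothetical and do not come from the simulations.

```python
# Allegiances grouped by the response they inspire, as described in the text.
FRIENDLY = {"friend", "assumed friend"}
HOSTILE = {"hostile", "suspect"}


def decide(stanag_probabilities):
    """Return the allegiance with the maximum STANAG-probability."""
    return max(stanag_probabilities, key=stanag_probabilities.get)


def is_good_decision(stanag_probabilities, ground_truth):
    """True when the decision matches the behavior class of the ground truth."""
    decision = decide(stanag_probabilities)
    if ground_truth == "friend":
        return decision in FRIENDLY
    if ground_truth == "hostile":
        return decision in HOSTILE
    return False


def good_decision_rate(runs, ground_truth):
    """Fraction of Monte-Carlo runs (at one time index) giving a good decision."""
    good = sum(is_good_decision(p, ground_truth) for p in runs)
    return good / len(runs)


# Two made-up runs at the same time index.
runs = [
    {"friend": 0.2, "assumed friend": 0.5, "suspect": 0.2, "hostile": 0.1},
    {"friend": 0.1, "assumed friend": 0.2, "suspect": 0.4, "hostile": 0.3},
]
print(good_decision_rate(runs, "friend"))
```

Grouping the four allegiances into two behavior classes is what makes small shifts between friend and assumed friend, or between suspect and hostile, invisible in the good decision rate.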


5.1 Effects of varying the k-l-x parameters 


Simulations were done on a computer with a Phenom II 955 
processor with 8 GB of memory. We should keep in mind 
that it is the relative time of execution which is important 
here. For Figures 7 to 11, the simulations had a value of 
80% for the ESM certainty and the value of the x parameter 
was maintained at 0.2, since changing it had no impact on 
the good decision rate. 


Figure 7 and Figure 8 show us the effect of the 
approximation from the good decision rate point of view 
when compared with the DSmH case from Figure 6. As 
for the typical simulated scenario of the previous section, l 
had no visible impact, and x had a limited impact only as 
the k parameter went below 4. As for the k parameter, it 
started having an impact when we reached 6, where the 
impact was on only three iterations. As the k parameter 
reaches 5, a very slight positive impact throughout the 
whole simulation can be seen. As for k=4 and k=3, we have 
a slight deterioration of the good decision rate, but it is still 
very small and rather insignificant considering the gain in 
execution time, as Figure 10 shows. For the cases with an 
ESM confidence at 90%, none of the approximated results has 
a significant impact on the good decision rate, except with 
klx = (3, 8, 0.2) where we had minimal impact. 

Figure 7. Approximated DSmH result with k-l-x = (5, 8, 
0.2) for the same Monte-Carlo simulation. 

Figure 8. Approximated DSmH result with k-l-x = (3, 8, 0.2) 
for the same Monte-Carlo simulation. 

Figure 9 and Figure 10 show the time of execution versus 
the k and l parameters of the klx approximation technique. 
Specifically, Figure 9 has the curve of the time 
of execution of the combination and approximation process 
only. The x-y plane, valued at 325.97 seconds in Figure 9, 
indicates the time below which the approximation process 
provides a higher gain in time than the time it consumes. It 
is the time of execution of the DSmH without 
approximation. 


We can see that the k parameter has to reach 5 before we 
start seeing an improvement. Above that value, the 
approximation takes more time to execute than it helps us 
gain. We can achieve a 30% improvement in time of 
execution when we reach k = 3. The parameter l has no 
impact on time. The absence of impact of the l parameter is 
suspected to be caused by the fact that this simulated 
scenario uses simple support functions as inputs. 

Figure 9. Execution time for the combination and 
approximation processes. 

Figure 10. Execution time for the whole simulation. 


In Figure 10, we have the curve of the time of execution for 
the whole simulation which, on top of the combination and 
approximation processes, includes the generalized pignistic 
transformation (GPT) used in the decision process. 
More than 95% of the extra time of execution, when compared 
to Figure 9, is spent in the GPT. 


In Figure 10, the x-y plane, representing the time of 
execution of the simulation without approximation, is 
valued at 1767.6 seconds. We can see that we can have a 
50% reduction in time of execution when we reach k=3 and 
that l has no impact. As we compare Figure 9 and Figure 
10, we see that the GPT is the step that benefits the most 
from the approximation process. 


6 Monte-Carlo simulations using various 
approximation rules 


In order to extend the analysis further, we ran 
the simulations of the current section with the number of 
Monte-Carlo runs set to 1000. We also extended the analysis 
to Lowrance’s summarization and Simard’s truncation 
techniques with the same stream of reports to fuse. Hence, 
both the DSmH and the approximated DSmH have the 
same dataset, so that we can freely compare the effects of 
the use of the approximation and the impact of the 
variation of its parameters. 


Figure 11, which shows results using Lowrance’s 
approximation technique, lets us see the inability of the 
technique to get better good decision rates than the 
non-approximated combination. The following figures show 
that k-l-x and Simard's truncation are both able to get, 
depending on the chosen parameters, better good 
decision rates than the scenario without approximation. 

Figure 11. DSmH using Lowrance’s apx. (3/5/8/10). 

Figure 15. DSmH using Simard’s apx. (0.7-0.1-3/5/8/10). 


Regarding the mean time of execution of the combination and 
approximation step for a realistic scenario, we have found 
that for a parameter 'K' below or equal to 5, we were able 
to execute faster than without approximation. And when 
looking at the previous figures, we see that, when K is too 
low (K~3), the approximation isn't as good as without 
approximation, and that at a value of 5, we were always at 
higher good decision rates than the case without 
approximation. 


So not only have we found a case executing faster than 
without approximation, but we've also found a 
case where it performs better in terms of good decision rate. 
That holds for approximation techniques other than 
Lowrance's, and is limited, until proven otherwise, to this 
case and to DSmH. 

Figure 14. DSmH using Simard’s apx. (0.7-0.04-3/5/8/10). 

Figure 16. Combination and approximation execution times. 


7 Conclusions 


The previous sections display the behavior for different 
cases of klx approximation on the same simulated ESM 
data (see Figure 6, Figure 7, and Figure 8). They also show the 
time of execution of each of those simulations. From those 
results we can conclude that we can attain the 
same good decision rate with an approximated DSmH as 
with the DSmH for the chosen scenario, while 
achieving lower times of execution, including the time to 
approximate, when we reach a certain level of 
approximation. Those results are confirmed by the 
experiments done on another simulated dataset with 
1000 Monte-Carlo runs (see Figure 12). 


We’ve also explored the behavior of Lowrance’s 
summarization approximation and Simard’s truncation 
methods using the same dataset, also on a thousand 
Monte-Carlo runs. From what we can see in Figure 11, the 
summarization is able, with a careful choice of its 
parameter, to reach the good decision rate of the combination 
rule without approximation. However, it seems to be rarely 
able to do better and can do much worse. Simard’s 
truncation method (see Figure 13, Figure 14 and Figure 15), 
on the other hand, is able to get around 5% better good 
decision rates, depending on the choice of the 
approximation parameters. It can also get the same rates as, or 
slightly lower rates than, the combination rule without 
approximation. 


When considering the execution times shown in 
Figure 16, we gather that, while being able to execute faster 
than the combination rule without approximation, we can 
get better decision rates. A ‘K’ parameter value of 3 or 5 
gave us the highest decision rates for Simard’s truncation 
method and for the k-l-x approximation technique. Note that 
parameter K sometimes had to be set at 3 and other times 
at 5, depending on the chosen technique and the other 
parameters, to reach the highest decision rate. 


Future work considered includes the exploration of the use 
of Bauer’s D1 approximation [5] in DSmT. Even if it adds 
to the number of operations and to the complexity of the 
system, it would be interesting to see if the gain acquired by 
approximating is sufficient to offset the increase in 
complexity. We are also interested in seeing whether it can give 
even better good decision rates than the other methods of 
approximation. 


References 


[1] A.P. Dempster, Upper and lower probabilities induced by a 
multivalued mapping, Ann. Math. Statist. 38, 1967. pp. 325-339. 


[2] G. Shafer, A Mathematical Theory of Evidence, Princeton Univ. 
Press, Princeton, NJ, 1976. 

[3] M.A. Simard, P. Valin and E. Shahbazian Fusion of ESM, Radar, IFF 
and other Attribute Information for Target Identity Estimation and a 
Potential Application to the Canadian Patrol Frigate, AGARD 66th 
Symposium on Challenge of Future EW System Design, 18-21 
October 1993, Ankara (Turkey), AGARD-CP-546, pp. 14.1-14.18. 

[4] B. Tessem, Approximations for efficient computation in the theory of 
evidence, Artificial Intelligence, vol. 61, pp. 315-329, June 1993. 

[5] M. Bauer, Approximation Algorithms and Decision Making in the 
Dempster-Shafer Theory of Evidence-An Empirical study, 
International Journal of Approximate Reasoning, vol. 17, no. 2-3, 
pp. 217-237, 1997. 

[6] Ph. Smets, Data Fusion in the Transferable Belief Model, 
Proceedings of the 3rd International Conference on Information 
Fusion, Fusion 2000, Paris, July 10-13, 2000, pp PS21-PS33. 

[7] D. Boily, and P. Valin, Truncated Dempster-Shafer Optimization and 
Benchmarking, in Sensor Fusion: Architectures, Algorithms, and 
Applications IV, SPIE Aerosense 2000, Orlando, Florida, April 24- 
28 2000, Vol. 4051, pp. 237-246. 

[8] D. Boily, and P. Valin, Optimization and Benchmarking of Truncated 
Dempster-Shafer for Airborne Surveillance, NATO Advanced Study 
Institute on Multisensor and Sensor Data Fusion, Pitlochry, 
Scotland, United Kingdom, June 25 — July 7 2000. Kluwer 
Academic Publishers, NATO Science Series, II. Mathematics 
Physics and Chemistry — Vol. 70, pp. 617-624. 

[9] R. Fidytek, A.W. Mostowski, R. Somla and A. Szepietowski. 
Algorithms counting monotone Boolean functions, Information 
Processing Letters, Vol. 79, Issue 5, pp. 203-209, 15 September 
2001. 

[10] F. Smarandache, J. Dezert, editors. Advances and Applications of 
DSmT for Information Fusion, vol. 1, American Research Press, 
2004. 

[11] STANAG 1241 (2005). NATO Standard Identity Description 
Structure for Tactical Use, North Atlantic Treaty Organization, 
April 2005. 

[12] P. Djiknavorian, and D. Grenier, Reducing DSmT hybrid rule 
complexity through optimisation of the calculation algorithm, in 
Advances and Applications of DSmT for Information Fusion, 
Collected Works edited by F. Smarandache, J. Dezert, Volume 2, 
American Research Press, 2006. 

[13] F. Smarandache, J. Dezert, editors. Advances and Applications of 
DSmT for Information Fusion, vol. 2, ARP, 2006. 

[14] P. Djiknavorian, Fusion d'informations dans un cadre de 
raisonnement de Dezert-Smarandache appliquée sur des rapports de 
capteurs ESM sous le STANAG 1241, Mémoire de maîtrise, 
Université Laval, 2008. 

[15] P. Djiknavorian, P. Valin, and D. Grenier, Dezert-Smarandache 
theory applied to highly conflicting reports for identification and 
recognition - Illustrative example of ESM associations in dense 
environments, DRDC Valcartier TR 2008-537, 34 pages. 

[16] P. Djiknavorian, P. Valin, and D. Grenier, Fusion of ESM allegiance 
reports using DSmT, in Advances and Applications of DSmT for 
Information Fusion, Collected Works edited by F. Smarandache, J. 
Dezert, Volume 3, American Research Press, 2009. 

[17] F. Smarandache, J. Dezert, editors. Advances and Applications of 
DSmT for Information Fusion, vol. 3, American Research Press, 
2009. 

[18] T. Carroll, J. Cooper and P. Tetali, Counting Antichains and Linear 
Extensions in Generalizations of the Boolean Lattice, August 30, 
2009. http://www.math.sc.edu/~cooper/calegbl.pdf 

[19] Lowrance, J. D., Garvey, T. D., and Strat, T. M., A framework for 
evidential reasoning systems, Proceedings of the 5th National 
Conference of the American Association for Artificial Intelligence, 
Philadelphia, 896-903, Aug. 1986. 







Advances and Applications of DSmT for Information Fusion. Collected Works. Volume 4 


Multiple Ground Target Tracking and 
Classification with DSmT 


Benjamin Pannetier 
Jean Dezert 


Originally published as Pannetier B., Dezert J., Multiple 
ground target tracking and classification with DSmT, 
Informatik 2010, 5th German Workshop in Sensor Data 
Fusion (SDF 2010), Leipzig, Germany, pp. 886-891, Sept 30 - 
Oct 1st, 2010, and reprinted with permission. 


Abstract: Based on our previous work, we propose to track multiple ground targets 
with GMTI (Ground Moving Target Indicator) sensors as well as with imagery sensors. 
The scope of this paper is to fuse the attribute type information given by heterogeneous 
sensors with DSmT (Dezert-Smarandache Theory) and to introduce the type results in 
the tracking process to improve its performance. 


1 Introduction 


Data fusion for ground battlefield surveillance is more and more strategic in order to create 
the situational assessment or improve the precision of fire control systems. For this, we 
develop new ground target tracking algorithms adapted to GMTI (Ground Moving Target 
Indicator) sensors. In fact, GMTI sensors are able to cover a large surveillance area for 
a few hours, or more if several sensors operate in the same operational theatre. Several 
references exist for MGT (Multiple Ground Tracking) with GMTI sensors [?, 8] which 
fuse contextual information with MTI reports. The main results are the improvement of 
the track precision and track continuity. Our algorithm [6] is built on several ideas 
inspired by this literature. The proposed VS-IMMC (Variable Structure Interacting Multiple 
Models) filter is extended to a multiple target context and integrated in an SB-MHT 
(Structured Branching - Multiple Hypotheses Tracking). 


One way to enhance data associations is to fuse data obtained by several sensors. The 
simplest approach is to consider the centralized fusion between two or more GMTI 
sensors. Another way is to introduce heterogeneous sensors in the centralized architecture 
in order to improve the data associations (by using the reports in location and their 
classification attribute) and compensate for the poor GMTI sensor classification. In our previous work 
[6], the classification information of the MTI segments and IMINT segments (IMagery 
INTelligence) was introduced in the target tracking process. The idea was to maintain, 
alongside each target track, a set of ID hypotheses. Their committed beliefs are revised in 
real time with the classifier decision through a very recent and efficient fusion rule called 
proportional conflict redistribution (PCR). 


In this paper, in addition to the measurement location fusion, we illustrate on a complex 
scenario our approach to fuse MTI classification type with image classification type 
associated to each report. 


2 Motion & observation models 
2.1 Constrained motion model 


The target state $x(k)$ at the current time $t_k$ is defined in a local horizontal plane $(O, X, Y)$ 
of a Topographic Coordinate Frame denoted TCF. The target state on the road segment $s$ 
is defined by $x_s(k)$, where the target position $(x_s(k), y_s(k))$ belongs to the road segment $s$ 
and the corresponding heading $(\dot{x}_s(k), \dot{y}_s(k))$ is in its direction. The event that the target 
is on road segment $s$ is noted $e_s(k) = \{x(k) \in s\}$. Given the event $e_s(k)$ and according 
to a motion model $M_i$, the estimation of the target state can be improved by considering 
the road segment $s$. The constrained motion model $M_s^i$ is built in such a way that the 
predicted state is on the road segment $s$ and the Gaussian noise is defined under the road 
segment constraint [6]. After the state estimation obtained by a Kalman filter, the estimated 
state is then projected according to the road constraint $e_s(k)$. This process is detailed in 
[6]. 


2.2 GMTI measurement model 


According to the NATO GMTI format [5], the MTI reports received at the fusion station 
are expressed in the WGS84 coordinate system. The MTI reports must be converted into the 
TCF. An MTI measurement $z$ at the current time $t_k$ is given in the TCF. Each MTI report is 
characterized both by the location and velocity information (range radial velocity) and 
by the attribute information and its probability that it is correct. We denote by $C_{MTI}$ 
the frame of discernment on target ID based on MTI data. $C_{MTI}$ is assumed to be constant 
over time and consists of a finite set of exhaustive and exclusive elements representing 
the possible states of the target classification. In this paper, we consider only 3 elements 
in $C_{MTI}$, defined as $C_{MTI}$ = {Tracked vehicle, Wheeled vehicle, Rotary wing aircraft}. 


We also consider the probabilities $P\{c(k)\}$ ($\forall c(k) \in C_{MTI}$) as input parameters of our 
tracking systems characterizing the global performance of the classifier. The vector of 
probabilities $[P(c_1)\ P(c_2)\ P(c_3)]$ represents the diagonal of the confusion matrix of the 
classification algorithm assumed to be used. Let $z^*_{MTI}(k)$ be the extended MTI measurement 
including both the kinematic part and the attribute part, expressed by the following formula: 


$z^*_{MTI}(k) = \{z_{MTI}(k), c(k), P\{c(k)\}\}$ (1) 


2.3 IMINT motion model 


For the imagery intelligence (IMINT), we consider two sensor types: a video EO/IR sensor 
carried by an Unmanned Aerial Vehicle (UAV) and an EO sensor fixed on an Unattended 
Ground Sensor (UGS). We assume that the IMINT reports $z_{video}(k)$ at the current time $t_k$ 
are expressed in the reference frame $(O, X, Y)$ and give location information and the target 
type. We assume that the video information given by both sensor types is processed by 
their own ground stations and that the system provides the video reports of target detections 
with their classification attributes. For the last point, a human operator selects targets on 
a movie frame and is able to choose their attribute with an HMI (Human Machine Interface). 
Based on the military symbology called 2525C [3], we build the frame of discernment for 
an EO/IR source, denoted $C_{video}$. Each video report is associated with the attribute 
information $c(k)$ ($\forall c(k) \in C_{video}$) with its probability $P\{c(k)\}$ that it is correct. Like $C_{MTI}$, 
$C_{video}$ is assumed to be constant over time and consists of a finite set of exhaustive and 
exclusive elements representing the possible states of the target classification. 


Let $z^*_{video}(k)$ be the extended video measurement including both the kinematic part and the 
attribute part, expressed by the following formula ($\forall c(k) \in C_{video}$): 


$z^*_{video}(k) = \{z_{video}(k), c(k), P\{c(k)\}\}$ (2) 


The attribute type of the image sensors belongs to a different and better classification than 
the MTI sensors. 


2.4 Taxonomy 


In our work, the symbology 2525C [3] is used to describe the links between the different 
classification sets $C_{MTI}$ and $C_{video}$. Figure 1 represents a short part of the 2525C used 
in this paper. The red elements underlined in italic style are the atomic elements of our 
taxonomy. Each element of both sets can be placed in Figure 1. For example, the “wheeled 
vehicle” of the set $C_{MTI}$ is placed at the level “Armoured → Wheeled”, or the “Volkswagen 
Touareg” given by the video is placed at the levels “Armoured → Wheeled → Medium” 
and “Civilian Vehicle → Jeep → Medium”. 


3 Tracking with road constraints 
3.1 VS IMM with a road network 


The IMM is an algorithm for combining state estimates arising from multiple filter models 
to get a better global state estimate when the target is maneuvering. In section 2.1, a 
motion model $i$ constrained to a road segment $s$, noted $M_s^i(k)$, was defined. We extend the 
segment constraint to the different dynamic models (among a set of r + 1 motion models) 
that a target can follow. The model indexed by r = 0 is the stop model. It is evident that 
when the target moves from one segment to the next, the set of dynamic models changes 
according to the road network configuration. The steps of the IMM under road segment s 
constraint are the same as for the classical IMM as described in [1]. 


In real applications, the predicted state could also appear on another road segment, 
because of a road turn for example, and we need to introduce new constrained motion models. 
In such a case, we activate the most probable road segment sets depending on the local 
predicted state location of the track $T^{k,l}$ [6]. We consider $r + 1$ oriented graphs which depend 
on the road network topology. For each graph $i$, $i = 0, 1, \ldots, r$, each node is a constrained 
motion model $M_s^i$. The nodes are connected to each other according to the road network 
configuration and one has a finite set of $r + 1$ motion models constrained to a road section. 
The selection of the most probable motion model set, to estimate the road section on which 
the target is moving, is based on Wald’s sequential probability ratio test (SPRT) [9]. 


Figure 1: 2525C (light version). 


3.2 Multiple target tracking 


For the MGT problem, we use the SB-MHT (Structured Branching Multiple Hypotheses 
Tracking) presented in [2]. When the new measurement set $Z(k)$ is received, a standard 
gating procedure is applied in order to validate MTI report to track pairings. The existing 
tracks are updated with VS-IMMC and the extrapolated and confirmed tracks are formed. 
More details can be found in chapter 16 of [2]. In order to mitigate the association problem, 
we need a probabilistic expression for the evaluation of the track formation hypotheses 
that includes all aspects of the data association problem. It is convenient to use the 
log-likelihood ratio (LLR) $L^l(k)$, or track score, of a track $T^{k,l}$ expressed at current time 
$t_k$. 


4 Target type tracking 


Our approach consists in using the belief on the identification attribute to revise the LLR 
with the posterior pignistic probability on the target type. We briefly recall the Target Type 
Tracking (TTT) principle and explain how to improve VS-IMMC SB-MHT with target ID 
information. TTT is based on the sequential combination (fusion) of the predicted belief 
of the type of the track with the current “belief measurement” obtained from the target 
classifier decision. The adopted combination rule is the so-called Proportional Conflict 
Redistribution rule no. 5 (PCR5) developed in the DSmT (Dezert-Smarandache Theory) 
framework, since it deals efficiently with (potentially highly) conflicting information. A 
detailed presentation with examples can be found in [4, 7]. 


4.1 Principle of the target type tracker 


We want to estimate the true target type $type(k)$ at time $k$ from the sequence of declarations $c(1)$, 
$c(2)$, ..., $c(k)$ made by the unreliable classifier up to time $k$. To build an estimator $\widehat{type}(k)$ 
of $type(k)$, we use the general principle of the Target Type Tracker (TTT) developed in 
[4], which consists of the following steps: 


1. Initialization step (i.e. $k = 0$). Select the target type frame $C_{Tot} = \{\theta_1, \ldots, \theta_n\}$ 
and set the prior bba $m^-(\cdot)$ as the vacuous belief assignment, i.e. $m^-(\theta_1 \cup \ldots \cup \theta_n) = 1$, 
since one has no information about the first observed target type. 


2. Generation of the current bba $m_{obs}(\cdot)$ from the current classifier declaration $c(k)$ 
based on attribute measurement. At this step, one takes $m_{obs}(c(k)) = P\{c(k)\} = 
C_{c(k)c(k)}$ and all the unassigned mass $1 - m_{obs}(c(k))$ is then committed to total 
ignorance $\theta_1 \cup \ldots \cup \theta_n$. $C_{c(k)c(k)}$ is the element of the known confusion matrix $C$ 
of the classifier indexed by $c(k)c(k)$. 


3. Combination of the current bba $m_{obs}(\cdot)$ with the prior bba $m^-(\cdot)$ to get the estimation of 
the current bba $m(\cdot)$. 


4. Estimation of the True Target Type is obtained from $m(\cdot)$ by taking the singleton of 
$\Theta$, i.e. a Target Type, having the maximum of belief (or possibly the maximum 
Pignistic Probability). 


5. Set $m^-(\cdot) = m(\cdot)$; do $k = k + 1$ and go back to step 2. 


Naturally, in order to revise the LLR in our GMTI-MTT system to take into account the 
estimation of belief of target ID coming from the Target Type Trackers, we transform the 
resulting bba $m(\cdot) = [m^- \oplus m_{obs}](\cdot)$ available at each time $k$ into a probability measure. 
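
The five steps above, followed by the transformation into a probability measure, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: target types are assumed mutually exclusive (Shafer's model), so focal elements are plain subsets of the frame; the combination is the two-source PCR5 rule; the probability measure is the classical pignistic transformation; and all function names and the confusion-matrix diagonal values are hypothetical.

```python
def observation_bba(declared, p_correct, frame):
    """Step 2: mass P{c(k)} on the declared type, the rest on total ignorance."""
    theta = frozenset(frame)
    bba = {frozenset([declared]): p_correct}
    bba[theta] = bba.get(theta, 0.0) + 1.0 - p_correct
    return bba


def pcr5(m1, m2):
    """Step 3: two-source PCR5 on subsets of an exclusive frame."""
    out = {}
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                # Non-conflicting product goes to the intersection.
                out[inter] = out.get(inter, 0.0) + wa * wb
            elif wa + wb > 0.0:
                # Conflicting product wa*wb is split between a and b
                # proportionally to the masses wa and wb involved.
                out[a] = out.get(a, 0.0) + wa * wa * wb / (wa + wb)
                out[b] = out.get(b, 0.0) + wb * wb * wa / (wa + wb)
    return out


def pignistic(bba):
    """Spread each mass uniformly over the singletons of its focal element."""
    betp = {}
    for focal, mass in bba.items():
        for element in focal:
            betp[element] = betp.get(element, 0.0) + mass / len(focal)
    return betp


def track_type(declarations, confusion_diag, frame):
    """Steps 1 to 5: sequential fusion, decision on the max pignistic probability."""
    prior = {frozenset(frame): 1.0}  # step 1: vacuous prior
    for declared in declarations:
        obs = observation_bba(declared, confusion_diag[declared], frame)
        prior = pcr5(prior, obs)  # steps 3 and 5
    betp = pignistic(prior)
    return max(betp, key=betp.get), prior  # step 4


frame = ["tracked", "wheeled", "rotary"]
diag = {"tracked": 0.8, "wheeled": 0.8, "rotary": 0.8}  # made-up values
estimate, bba = track_type(["wheeled", "wheeled", "tracked", "wheeled"], diag, frame)
print(estimate, bba)
```

In this sketch a single conflicting declaration ("tracked") lowers the mass on "wheeled" without collapsing it, which illustrates why a proportional redistribution of the conflict is attractive for an unreliable classifier.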


4.2 Data attributes in the VS IMMC 


To improve the target tracking process, the target type probability is introduced 
in the likelihood calculation. For this, we consider the measurement $z_j^*(k)$ ($\forall j \in 
\{1, \ldots, m\}$) described in (1) and (2). With the assumption that the kinematic and 
classification observations are independent, it is easy to prove that the new combined likelihood 
$\Lambda$ associated with a track $T^{k,l}$ is the product of the kinematic likelihood. 


5 Illustration 


In the extended version of this paper, we will illustrate our algorithm by using a complex 
scenario generated with a powerful simulator developed at ONERA. The area of interest 
is located in a fictitious country called North Atlantis. In this scenario, the goal is to detect 
and track several targets with 2 GMTI sensors (STARS, SIDM), 18 UGS and 4 UAV 
(SDTD), in order to build the situation assessment and evaluate the threat so as to protect 
the coalition forces. In the operational theatre, 250 targets evolve; they can maneuver on 
and off the road network. The set of target types is significant: we can have for instance 
civilian vehicles (such as 4x4, cars, bus, truck,...) and military vehicles as well (T-62, AMX 
30, Kamakoy,...). Illustrations and conclusions for our algorithm will be presented in the 
extended version of this paper. 


Edge Detection in Color Images Based on DSmT 


Jean Dezert 
Zhun-ga Liu 
Grégoire Mercier 


Abstract—In this paper, we present a non-supervised method- 
ology for edge detection in color images based on belief functions 
and their combination. Our algorithm is based on the fusion of 
local edge detectors results expressed into basic belief assignments 
thanks to a flexible modeling, and the proportional conflict redis- 
tribution rule developed in DSmT framework. The application 
of this new belief-based edge detector is tested both on original 
(noise-free) Lena’s picture and on a modified image including 
artificial pixel noises to show the ability of our algorithm to 
work on noisy images too. 

Keywords: Edge detection, image processing, DSmT, DST, 
fusion, belief functions. 


I. INTRODUCTION 


Edge detection is one of the most important tasks in image 
processing and its application to color images is still subject 
to a very strong interest [8], [10]-[12], [14] for example in 
teledetection, in remote sensing, target recognition, medical 
diagnosis, computer vision and robotics, etc. Most of basic 
image processing algorithms developed in the past for gray- 
scale images have been extended to multichannel images. Edge 
detection algorithms for color images have been classified into 
three main families [15]: 1) fusion methods, 2) multidimen- 
sional gradient methods and 3) vector methods depending on 
the position of where the recombination step applies [7]. In 
this paper, the method we propose uses a fusion method with 
a multidimensional gradient method. Our new unsupervised 
edge detector combines the results obtained by gray-scale 
edge detectors for individual color channels [3] to define 
bba’s from the gradient values which are combined using 
Dezert-Smarandache Theory [17] (DSmT) of plausible and 
paradoxical reasoning for information fusion. DSmT has 
proved to be a serious alternative to the well-known Dempster-Shafer 
Theory of mathematical evidence [16], especially for 
dealing with highly conflicting sources of evidence. Some 
supervised edge detectors based on belief functions computed 
from gaussian pdf assumptions and Dempster-Shafer Theory 
can be found in [1], [21]. In this work, we show through very 
simple examples how edge detection can be performed based 
on DSmT fusion techniques with belief functions without 
learning (supervision). The interest for using belief functions 
for edge detection comes from their ability to model more ade- 
quately uncertainties with respect to the classical probabilistic 
modeling approach, and to deal with conflicting information 
due to spatial changes in the image or noises. This paper is 
organized as follows: In section 2 we briefly recall the basics 
of DSmT and the fusion rule we use. In section 3, we present 
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in details our new edge detector based on belief functions and 
their fusion. Results of our new algorithm tested on the original 
Lena’s picture and its noisy version are presented in section 
4 with a comparison to the classical Canny’s edge detector. 
Conclusions and perspectives are given in section 5. 


II. BASICS OF DSMT 


The purpose of DSmT [17] is to overcome the limitations 
of DST [16] mainly by proposing new underlying models 
for the frames of discernment in order to fit better with 
the nature of real problems, and proposing new efficient 
combination and conditioning rules. In the DSmT framework, the
elements $\theta_i$, $i = 1, 2, \ldots, n$, of a given frame $\Theta$ are not
necessarily exclusive, and there is no restriction on the $\theta_i$ other than their
exhaustivity. In DSmT, the hyper-power set $D^\Theta$
is defined as the set of all composite propositions
built from elements of $\Theta$ with the operators $\cup$ and $\cap$. For instance,
if $\Theta = \{\theta_1, \theta_2\}$, then $D^\Theta = \{\emptyset, \theta_1, \theta_2, \theta_1 \cap \theta_2, \theta_1 \cup \theta_2\}$. A
(generalized) basic belief assignment (bba for short) is defined
as the mapping $m : D^\Theta \to [0,1]$. The generalized belief and
plausibility functions are defined in almost the same manner
as in DST. More precisely, from a general frame $\Theta$, we define
a map $m(.) : D^\Theta \to [0,1]$ associated to a given body of
evidence $\mathcal{B}$ as


$$m(\emptyset) = 0 \quad \text{and} \quad \sum_{A \in D^\Theta} m(A) = 1 \qquad (1)$$


The quantity $m(A)$ is called the generalized basic belief
assignment/mass (or just "bba" for short) of $A$.


The generalized credibility and plausibility functions are defined
in almost the same manner as within DST, i.e.

$$\text{Bel}(A) = \sum_{\substack{B \subseteq A \\ B \in D^\Theta}} m(B) \quad \text{and} \quad \text{Pl}(A) = \sum_{\substack{B \cap A \neq \emptyset \\ B \in D^\Theta}} m(B) \qquad (2)$$
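As a small illustration of (2), the sketch below evaluates the credibility and plausibility of a hypothesis under Shafer's model (exclusive hypotheses), which is the model used in the rest of the paper. Representing a bba as a Python dictionary that maps focal elements (frozensets of hypothesis labels) to masses, and the names `bel` and `pl`, are assumptions made for this example only; the mass values are illustrative.

```python
def bel(m, a):
    """Credibility of a: total mass of the non-empty focal elements included in a."""
    return sum(mass for b, mass in m.items() if b and b <= a)


def pl(m, a):
    """Plausibility of a: total mass of the focal elements that intersect a."""
    return sum(mass for b, mass in m.items() if b & a)


# Hypothetical bba on the frame {t1, t2} (illustrative values only).
T1, T2 = frozenset({"t1"}), frozenset({"t2"})
m = {T1: 0.5, T2: 0.2, T1 | T2: 0.3}

print(bel(m, T1), pl(m, T1))  # credibility and plausibility of t1
```

The gap between the two values is the mass committed to the ignorance, which is what the edge detector later uses to represent uncertain pixels.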
Two models¹ (the free model and the hybrid model) in DSmT
can be used to define the bba’s to combine. In the free
DSm model, the sources of evidence are combined without
taking into account integrity constraints. When the free DSm
model does not hold because of the true nature of the fusion
problem under consideration, we can take into account some
known integrity constraints and define the bba’s to combine using
the proper hybrid DSm model. All details of DSmT with
many examples can be easily found in [17], available freely
on the web. In this paper, we will work only with Shafer’s
model of the frame, where all elements $\theta_i$ of $\Theta$ are assumed
truly exhaustive and exclusive (disjoint); therefore $D^\Theta$
reduces to the classical power set $2^\Theta$ and generalized belief
functions reduce to classical ones as within the DST framework.
Besides offering the possibility to work with different underlying
models (not only Shafer’s model as within DST), DSmT also offers
new efficient combination rules based on proportional
conflict redistribution (PCR rules no. 5 and no. 6) for combining
highly conflicting sources of evidence. In the DSmT framework,
the classical pignistic transformation BetP(.) is replaced
by the more effective DSmP(.) transformation to estimate
the subjective probabilities of hypotheses for decision-making
support once the combination of bba’s has been obtained.
Before presenting our new edge detector, we briefly recall
the PCR5 fusion rule and the DSmP transformation.
All details and justifications, with examples, on PCR5 and DSmP
can be found freely on the web in [17], Vols. 2 & 3, and
will not be reported here.

¹ Actually, Shafer’s model, considering all elements of the frame as truly
exclusive, can be viewed as a special case of the hybrid model.


A. PCR5 fusion rule


The Proportional Conflict Redistribution Rule no. 5 (PCR5)
is generally used to combine bba’s in the DSmT framework. PCR5
transfers the conflicting mass only to the elements involved in
the conflict and proportionally to their individual masses, so
that the specificity of the information is entirely preserved in
this fusion process. Let $m_1(.)$ and $m_2(.)$ be two independent²
bba’s; then the PCR5 rule is defined as follows (see [17], Vol.
2 for full justification and examples): $m_{PCR5}(\emptyset) = 0$ and,
$\forall X \in 2^\Theta \setminus \{\emptyset\}$,

$$m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2) + \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2 m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2 m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \qquad (3)$$


where all denominators in (3) are different from zero. If a
denominator is zero, that fraction is discarded. Additional
properties of PCR5 can be found in [5]. The extension of PCR5
for combining qualitative bba’s can be found in [17], Vols. 2 &
3. All propositions/sets are in a canonical form. A variant of
PCR5, called PCR6, has been proposed by Martin and Osswald
in [17], Vol. 2, for combining $s > 2$ sources. The general
formulas for the PCR5 and PCR6 rules are also given in [17], Vol.
2. PCR6 coincides with PCR5 when one combines two
sources. The difference between PCR5 and PCR6 lies in the
way the proportional conflict redistribution is done as soon as
three or more sources are involved in the fusion. For example,
let’s consider three sources with bba’s $m_1(.)$, $m_2(.)$ and $m_3(.)$,
$A \cap B = \emptyset$ for the model of the frame $\Theta$, and $m_1(A) = 0.6$,
$m_2(B) = 0.3$, $m_3(B) = 0.1$. With PCR5 the partial conflicting
mass $m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 0.018$
is redistributed back to A and B only, with respect to the
following proportions respectively: $x_A^{PCR5} = 0.01714$ and
$x_B^{PCR5} = 0.00086$, because the proportionalization requires

$$\frac{x_A^{PCR5}}{m_1(A)} = \frac{x_B^{PCR5}}{m_2(B) m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + m_2(B) m_3(B)}$$

that is

$$\frac{x_A^{PCR5}}{0.6} = \frac{x_B^{PCR5}}{0.03} = \frac{0.018}{0.6 + 0.03} \approx 0.02857$$

thus

$$x_A^{PCR5} = 0.60 \cdot 0.02857 \approx 0.01714$$
$$x_B^{PCR5} = 0.03 \cdot 0.02857 \approx 0.00086$$

² I.e. each source provides its bba independently of the other sources.


With the PCR6 fusion rule, the partial conflicting mass
$m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 0.018$ is redistributed
back to A and B only, with respect to the following proportions
respectively: $x_A^{PCR6} = 0.0108$ and $x_B^{PCR6} = 0.0072$, because
the PCR6 proportionalization is done as follows:

$$\frac{x_A^{PCR6}}{m_1(A)} = \frac{x_{B,2}^{PCR6}}{m_2(B)} = \frac{x_{B,3}^{PCR6}}{m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + m_2(B) + m_3(B)}$$

that is

$$\frac{x_A^{PCR6}}{0.6} = \frac{x_{B,2}^{PCR6}}{0.3} = \frac{x_{B,3}^{PCR6}}{0.1} = \frac{0.018}{0.6 + 0.3 + 0.1} = 0.018$$

thus

$$x_A^{PCR6} = 0.6 \cdot 0.018 = 0.0108$$
$$x_{B,2}^{PCR6} = 0.3 \cdot 0.018 = 0.0054$$
$$x_{B,3}^{PCR6} = 0.1 \cdot 0.018 = 0.0018$$

and therefore with PCR6, one finally gets the following
redistributions to A and B:

$$x_A^{PCR6} = 0.0108$$
$$x_B^{PCR6} = x_{B,2}^{PCR6} + x_{B,3}^{PCR6} = 0.0054 + 0.0018 = 0.0072$$
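The arithmetic of this three-source example can be checked with a few lines of Python. The sketch below only redistributes one partial conflicting product, following the two proportionalizations used in the example; it is not the general PCR5 or PCR6 formula of [17], and the function names `pcr5_partial` and `pcr6_partial` are hypothetical.

```python
from collections import defaultdict
from math import prod


def pcr5_partial(assignment):
    """Redistribute one conflicting product the PCR5 way.

    assignment: list of (focal element label, mass), one entry per source.
    Masses committed to the same element are first multiplied together,
    and the conflict is shared proportionally to these grouped products.
    """
    conflict = prod(mass for _, mass in assignment)
    grouped = defaultdict(lambda: 1.0)
    for label, mass in assignment:
        grouped[label] *= mass
    denom = sum(grouped.values())
    return {label: conflict * g / denom for label, g in grouped.items()}


def pcr6_partial(assignment):
    """Redistribute one conflicting product the PCR6 way.

    Each source receives a share proportional to its own mass; shares that
    go to the same element are then added.
    """
    conflict = prod(mass for _, mass in assignment)
    denom = sum(mass for _, mass in assignment)
    shares = defaultdict(float)
    for label, mass in assignment:
        shares[label] += conflict * mass / denom
    return dict(shares)


# m1(A) = 0.6, m2(B) = 0.3, m3(B) = 0.1 as in the example.
example = [("A", 0.6), ("B", 0.3), ("B", 0.1)]
x5 = pcr5_partial(example)
x6 = pcr6_partial(example)
print(x5, x6)
```

In both cases the shares add up to the partial conflicting mass 0.018; only the split between A and B differs.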


From the implementation point of view, PCR6 is simpler to
implement than PCR5. Very basic Matlab codes for the PCR5 and
PCR6 fusion rules can be found in [17], [18].
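For readers who prefer Python to the Matlab codes cited above, the following is a minimal sketch of the two-source PCR5 rule (3) on the classical power set. It is not the code of [17], [18]; the dictionary representation of a bba (frozenset of hypothesis labels to mass) and the name `pcr5` are assumptions of this example, and the mass values are illustrative.

```python
from itertools import product


def pcr5(m1, m2):
    """Two-source PCR5 rule, Eq. (3), for bba's given as {frozenset: mass}."""
    fused = {}
    for (x1, a), (x2, b) in product(m1.items(), m2.items()):
        if a == 0.0 or b == 0.0:
            continue  # zero product; also avoids a zero denominator
        common = x1 & x2
        if common:
            # Conjunctive part: the product goes to the intersection.
            fused[common] = fused.get(common, 0.0) + a * b
        else:
            # Conflict: a*b is sent back to x1 and x2 proportionally to a and b.
            fused[x1] = fused.get(x1, 0.0) + a * a * b / (a + b)
            fused[x2] = fused.get(x2, 0.0) + b * b * a / (a + b)
    return fused


# Hypothetical bba's on the frame {t1, t2}.
T1, T2 = frozenset({"t1"}), frozenset({"t2"})
THETA = T1 | T2
m1 = {T1: 0.6, THETA: 0.4}
m2 = {T2: 0.3, THETA: 0.7}
fused = pcr5(m1, m2)
print(fused)
```

Here the only conflicting product is $m_1(\theta_1) m_2(\theta_2) = 0.18$, of which 0.12 returns to $\theta_1$ and 0.06 to $\theta_2$, so the fused masses still sum to one without any normalization step.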


B. DSmP transformation 


The DSmP probabilistic transformation is a serious alternative to
the classical pignistic transformation; it allows one to increase
the probabilistic information content (PIC), i.e. to reduce the
Shannon entropy, of the approximated subjective probability
measure drawn from any bba. Justification and comparisons
of DSmP(.) w.r.t. BetP(.) and other transformations can
be found in detail in [6], [17], Vol. 3, Chap. 3. The DSmP transformation
is defined³ by $DSmP_\epsilon(\emptyset) = 0$ and, $\forall X \in 2^\Theta \setminus \{\emptyset\}$,

$$DSmP_\epsilon(X) = \sum_{Y \in 2^\Theta} \frac{\sum_{\substack{Z \subseteq X \cap Y \\ |Z| = 1}} m(Z) + \epsilon \cdot |X \cap Y|}{\sum_{\substack{Z \subseteq Y \\ |Z| = 1}} m(Z) + \epsilon \cdot |Y|} \, m(Y) \qquad (4)$$


where $|X \cap Y|$ and $|Y|$ denote the cardinals of the sets $X \cap Y$
and $Y$ respectively; $\epsilon \geq 0$ is a small number which allows one to
increase the PIC value of the approximation of $m(.)$ into a
subjective probability measure. Usually $\epsilon = 0$, but in some
particular degenerate cases, when the $DSmP_{\epsilon=0}(.)$ values
cannot be derived, the $DSmP_{\epsilon>0}(.)$ values can always
be derived by choosing $\epsilon$ as a very small positive number,
say $\epsilon = 1/1000$ for example, in order to be as close as we
want to the highest value of the PIC. The smaller $\epsilon$, the
better/bigger the PIC value one gets. When $\epsilon = 1$ and when
the masses of all elements $Z$ having $|Z| = 1$ are zero,
$DSmP_{\epsilon=1}(.) = BetP(.)$, where the pignistic transformation
BetP(.) is defined by [19]:

$$BetP\{X\} = \sum_{Y \in 2^\Theta} \frac{|Y \cap X|}{|Y|} \, m(Y) \qquad (5)$$

with the convention $|\emptyset| / |\emptyset| = 1$.

³ Here we work on the classical power set, but DSmP can also be defined for
working with other fusion spaces, hyper-power sets or super-power sets if
necessary.
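To make (4) and (5) concrete, the sketch below computes both transformations for the singletons of the frame, which is all that is needed for decision-making. The dictionary representation of a bba and the function names `betp` and `dsmp` are assumptions of this example; the degenerate case with a zero denominator is reported as an error rather than handled.

```python
def betp(m, frame):
    """Pignistic probability (5) of each singleton of the frame."""
    return {x: sum(mass / len(y) for y, mass in m.items() if x in y)
            for x in frame}


def dsmp(m, frame, eps=0.0):
    """DSmP transformation (4) of each singleton of the frame."""
    singles = {x: m.get(frozenset({x}), 0.0) for x in frame}
    probs = {}
    for x in frame:
        total = 0.0
        for y, mass in m.items():
            if x not in y:
                continue  # empty intersection: the numerator of (4) is zero
            denom = sum(singles[z] for z in y) + eps * len(y)
            if denom == 0.0:
                raise ValueError("degenerate case: choose eps > 0")
            total += mass * (singles[x] + eps) / denom
        probs[x] = total
    return probs


# Hypothetical bba on the frame {t1, t2}.
frame = ["t1", "t2"]
T1, T2 = frozenset({"t1"}), frozenset({"t2"})
m = {T1: 0.54, T2: 0.18, T1 | T2: 0.28}
p_bet = betp(m, frame)
p_dsm = dsmp(m, frame, eps=0.0)
print(p_bet, p_dsm)
```

BetP splits the mass of the ignorance equally between the two hypotheses, whereas DSmP with $\epsilon = 0$ splits it proportionally to the singleton masses, which yields a more contrasted (higher PIC) probability measure.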


C. DS combination rule 


The Dempster-Shafer (DS) rule of combination is the main
historical (and still widely used) rule proposed by Glenn
Shafer in his milestone book [16]. Very passionate debates
have emerged in the literature about the justification and the
behavior of this rule since Zadeh’s famous criticism in
[22]. We don’t plan to reopen this endless debate and just
want to recall briefly here how it is mathematically defined.
Let’s consider a given discrete and finite frame of discernment
$\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ of exclusive and exhaustive hypotheses
(i.e. satisfying Shafer’s model) and two independent bba’s
$m_1(.)$ and $m_2(.)$ defined on $2^\Theta$; then the DS rule of combination
is defined by $m_{DS}(\emptyset) = 0$ and, $\forall X \neq \emptyset$ with $X \in 2^\Theta$:

$$m_{DS}(X) = \frac{1}{1 - K_{12}} \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2) \qquad (6)$$

where $K_{12} \triangleq \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = \emptyset}} m_1(X_1) m_2(X_2)$ represents the
total conflict between the sources. If $K_{12} = 1$, the sources of
evidence are in full conflict and the DS rule cannot be applied.
The DS rule is commutative and associative, and can be extended to
the fusion of $s > 2$ sources as well. The main criticism of
this rule concerns its unexpected/counter-intuitive behavior
as soon as the degree of conflict between sources becomes
high (see [17], Vol. 1, Chapter 5 and references therein for
details and examples).
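A minimal sketch of the DS rule (6) is given below for comparison with PCR5. The dictionary representation of a bba and the name `dempster` are assumptions of this example; the function also returns the total conflict $K_{12}$ so that the level of conflict can be inspected.

```python
def dempster(m1, m2):
    """DS rule (6) for two bba's given as {frozenset: mass}.

    Returns the fused bba and the total conflict K12.
    """
    conjunctive = {}
    k12 = 0.0
    for x1, a in m1.items():
        for x2, b in m2.items():
            common = x1 & x2
            if common:
                conjunctive[common] = conjunctive.get(common, 0.0) + a * b
            else:
                k12 += a * b
    if 1.0 - k12 <= 1e-12:
        raise ValueError("sources in full conflict: DS rule cannot be applied")
    return {x: v / (1.0 - k12) for x, v in conjunctive.items()}, k12


# Hypothetical bba's on the frame {t1, t2}.
T1, T2 = frozenset({"t1"}), frozenset({"t2"})
THETA = T1 | T2
m1 = {T1: 0.6, THETA: 0.4}
m2 = {T2: 0.3, THETA: 0.7}
m_ds, k12 = dempster(m1, m2)
print(m_ds, k12)
```

Unlike PCR5, the conflicting mass is removed and the remaining masses are rescaled by $1/(1 - K_{12})$, so part of the conflict ends up reinforcing the ignorance $\theta_1 \cup \theta_2$ as well.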


D. Decision-making support 


Decisions are achieved by computing the expected utilities
of the acts using either the subjective/pignistic BetP{.} (usually
adopted in the DST framework) or DSmP(.) (as suggested
in the DSmT framework) as the probability function needed to
compute expectations. Usually, one uses the maximum of the
pignistic probability as decision criterion. The maximum of
BetP{.} is often considered as a prudent betting decision
criterion between the two other decision strategies (max of
plausibility or max of credibility, which appear to be respectively
too optimistic or too pessimistic). It is easy to show that
BetP{.} is indeed a probability function (see [19], [20]), as
is DSmP(.) (see [17], Vol. 2). The max of DSmP(.)
is considered more efficient for practical applications since
DSmP(.) is more informative (it has a higher PIC value) than
the BetP(.) transformation.


III. EDGE DETECTION BASED ON DSMT AND FUSION 


In this work, we use the most common RGB (Red-Green-Blue)
representation of the digital color image, where each
layer (channel) R, G and B consists of a matrix of $N_i \times N_j$
pixels. The discrete value of each pixel in a given color channel
is assumed to lie in a given absolute interval of color intensity
$[c_{\min}, c_{\max}]$. The principle of our new edge detector based
on DSmT is very simple and consists of the following steps:


A. Step 1: Construction of bba’s 


Let’s consider a given channel (color layer) and denote it 
as L which can represent either the Red (R) color layer, the 
Green (G) color layer or the Blue (B) color layer, or any other 
channel in a more general case for multispectral images. For 
simplicity, we focus our work and presentation here on color 
images only. 

Apply an edge detector algorithm to each color channel L
to get, for each pixel $x_{ij}^L$, $1 \le i \le N_i$, $1 \le j \le N_j$,
an associated bba $m_{ij}^L(.)$ expressing the local belief that this
pixel belongs or not to an edge. The frame of discernment $\Theta$
used to define the bba’s is very simple and is defined as

$$\Theta = \{\theta_1 \triangleq \text{Pixel} \in \text{Edge}, \; \theta_2 \triangleq \text{Pixel} \notin \text{Edge}\} \qquad (7)$$

$\Theta$ is assumed to satisfy Shafer’s model (i.e. $\theta_1 \cap \theta_2 = \emptyset$).
It is clear that many (binary) edge detection algorithms are
available in the image processing literature, but here we want
a "smooth" algorithm able to provide both the belief of each
pixel to belong or not to an edge and also the uncertainty one
has on the classification of this pixel. In this subsection, we
present a very simple algorithm for accomplishing this task at
the color channel level. Obviously the quality of the algorithm
used in this first step will have a strong impact on the final
result, and therefore it is important to focus research efforts on
the development of efficient algorithms for realizing this step
as well as possible.

As in the Sobel method [9], two 3 × 3 kernels are convolved
with the original image $A^L$ of each layer L to calculate
approximations of the derivatives, one for horizontal changes
and one for vertical changes. We then obtain two gradient images
$G_x^L$ and $G_y^L$ for each layer L, which represent the horizontal and
vertical derivative approximations at each pixel $x_{ij}^L$. The
x-coordinate is defined as increasing in the right direction, and
the y-coordinate as increasing in the down direction. At
each pixel $x_{ij}^L$ of the color layer L, the gradient magnitude
$g_{ij}^L$ can be estimated by combining the two gradient
approximations as:

$$g_{ij}^L = \sqrt{G_x^L(i,j)^2 + G_y^L(i,j)^2} \qquad (8)$$

where

$$G_x^L = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * A^L, \qquad G_y^L = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} * A^L$$

and where $*$ denotes the 2-dimensional convolution operation.
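The gradient magnitude (8) of one color layer can be sketched with NumPy as follows. The helper names `convolve3x3` and `gradient_magnitude`, and the replication of border pixels, are assumptions of this example; the paper does not state how image borders are treated. With these unnormalized kernels the magnitude can exceed the pixel intensity range, which is one reason to choose thresholds relative to $\max(g)$.

```python
import numpy as np

# Sobel kernels for horizontal (x) and vertical (y) changes.
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)


def convolve3x3(layer, kernel):
    """2-D convolution of a layer with a 3x3 kernel, same output size.

    Border pixels are replicated (an assumption of this sketch).
    """
    flipped = kernel[::-1, ::-1]  # convolution = correlation with flipped kernel
    padded = np.pad(layer.astype(float), 1, mode="edge")
    rows, cols = layer.shape
    out = np.zeros((rows, cols), dtype=float)
    for di in range(3):
        for dj in range(3):
            out += flipped[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out


def gradient_magnitude(layer):
    """Gradient magnitude of Eq. (8) for one color layer."""
    gx = convolve3x3(layer, KX)
    gy = convolve3x3(layer, KY)
    return np.sqrt(gx ** 2 + gy ** 2)


# Synthetic layer with a vertical step edge between columns 2 and 3.
layer = np.zeros((5, 6))
layer[:, 3:] = 100.0
g = gradient_magnitude(layer)
print(g[2])
```

On this synthetic layer the magnitude is non-zero only in the two columns adjacent to the step, which is the behavior the bba construction relies on.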


In Sobel’s detection method, the edge detection for a pixel
$x_{ij}$ of a gray image is declared based on a hard thresholding
of the $g_{ij}$ value. Such a Sobel detector is sensitive to noise and
can generate false alarms. In this work, the $g_{ij}^L$ values are used
only to define the mass function (bba) of each pixel in each
layer over the power set of $\Theta$ defined in (7). If the $g_{ij}^L$
value of a pixel is big, this pixel is more likely
to belong to an edge. If the $g_{ij}^L$ value of the pixel $x_{ij}^L$ is low, then
our belief that it belongs to an edge must be low too. Such a
very simple and intuitive modeling can be obtained directly
from the sigmoid functions commonly used as activation
functions in neural networks, or as fuzzy membership functions in
fuzzy subset theory, as explained below.


Let’s consider the sigmoid function defined as

$$f_{\lambda,t}(g) \triangleq \frac{1}{1 + e^{-\lambda (g - t)}} \qquad (9)$$

where $g$ is the gradient magnitude of the pixel under consideration and
$t$ is the abscissa of the inflection point of the sigmoid, which
can be selected as $t = p \cdot \max(g)$, where $p$ is a proportion
parameter and $\cdot$ denotes multiplication by a scalar. When working
with noisy images, $p$ is increased with the level of noise.
$\lambda$ sets the slope of the tangent at the inflection point (the slope
there equals $\lambda/4$).

It can be easily verified that the bba $m_{ij}^L(.|g_{ij}^L)$ satisfying
the expected behavior can be obtained by the fusion⁴ of the
two following simple bba’s defined by:

focal element              | $m_1(.)$                   | $m_2(.)$
$\theta_1$                 | $f_{\lambda,t_e}(g)$       | $0$
$\theta_2$                 | $0$                        | $f_{-\lambda,t_n}(g)$
$\theta_1 \cup \theta_2$   | $1 - f_{\lambda,t_e}(g)$   | $1 - f_{-\lambda,t_n}(g)$

with $0 < t_n < t_e < 255$, $\lambda > 0$.


$t_e$ is the lower threshold for the edge detection, and $t_n$ is
the upper threshold for the non-edge detection. Thus, $[t_n, t_e]$
corresponds to our uncertainty decision zone, and the $g_{ij}^L$ values
lying in this interval correspond to the unknown decision state.
The bounds (thresholds) $t_n$ and $t_e$ can be tuned based on the
average gradient values of the image, and the length $t_e - t_n$
depends on the level of the noise. If the image is very
noisy, the information is very uncertain, and the
length of the interval $[t_n, t_e]$ can become large. Otherwise,
it is small. Because of the structure of these two simple bba’s,
the fusion obtained with the PCR5, DS or even DSm hybrid
(DSmH) rules of combination provides globally similar results,
and therefore the choice of the fusion rule does not really
matter here to build $m_{ij}^L(.|g_{ij}^L)$, as shown in Figures 1-3. PCR5,
which is the most specific fusion rule (it reduces the level of
belief committed to the uncertainty), is used in this work to
generate $m_{ij}^L(.|g_{ij}^L)$.

⁴ With DS, PCR5 or even with the DSmH rule [17].


Figure 1. Computation of $m_{ij}^L(.|g_{ij}^L)$ from $m_1(.)$ and $m_2(.)$ with $[t_n, t_e] = [60, 100]$ and $\lambda = 0.09$ (panels: PCR5 fusion, DS fusion, DSmH fusion).

Figure 2. Computation of $m_{ij}^L(.|g_{ij}^L)$ from $m_1(.)$ and $m_2(.)$ with $[t_n, t_e] = [50, 80]$ and $\lambda = 0.06$.

Figure 3. Computation of $m_{ij}^L(.|g_{ij}^L)$ from $m_1(.)$ and $m_2(.)$ with $[t_n, t_e] = [30, 40]$ and $\lambda = 0.04$.

In summary, $m_{ij}^L(.|g_{ij}^L)$ can be easily constructed from the
choice of the thresholding parameters $t_e$, $t_n$ defining the uncertainty
zone of the gradient values, the slope parameter $\lambda$ of the
sigmoids, and of course from the gradient magnitude $g_{ij}^L$. This
approach is very easy to implement and very flexible, since it
depends on parameters which are totally under the control
of the user.
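The construction of Step 1 can be sketched as follows for a whole array of gradient magnitudes. For the two simple bba’s $m_1(.)$ and $m_2(.)$, the only conflicting product is $m_1(\theta_1) m_2(\theta_2)$, so the PCR5 rule (3) reduces to the closed form used below. The function names `sigmoid` and `pixel_bba` are assumptions of this example, and the parameter values are those quoted for Figure 2.

```python
import numpy as np


def sigmoid(g, lam, t):
    """Sigmoid of Eq. (9) with slope parameter lam and inflection point t."""
    return 1.0 / (1.0 + np.exp(-lam * (g - t)))


def pixel_bba(g, t_n, t_e, lam):
    """PCR5 fusion of the two simple bba's built from gradient magnitudes g.

    Returns the masses of (edge, non-edge, edge-or-non-edge).
    """
    g = np.asarray(g, dtype=float)
    a = sigmoid(g, lam, t_e)    # m1(theta1): belief in "edge"
    b = sigmoid(g, -lam, t_n)   # m2(theta2): belief in "non-edge"
    conflict = a * b            # only conflicting product of the two sources
    edge = a * (1.0 - b) + a * conflict / (a + b)
    non_edge = b * (1.0 - a) + b * conflict / (a + b)
    unknown = (1.0 - a) * (1.0 - b)
    return edge, non_edge, unknown


# Three gradient values: below t_n, inside [t_n, t_e], above t_e.
g = np.array([0.0, 65.0, 200.0])
edge, non_edge, unknown = pixel_bba(g, t_n=50.0, t_e=80.0, lam=0.06)
print(edge, non_edge, unknown)
```

A low gradient commits most of the mass to "non-edge", a high gradient to "edge", and a gradient in the middle of $[t_n, t_e]$ leaves most of the mass on the ignorance.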


B. Step 2: Fusion of bba’s $m_{ij}^L(.)$


Many combination rules, like the DS rule, the Dubois & Prade rule,
Yager’s rule, and so on, can be used with our approach. In
this work, we only investigate the two most
well-known rules (the DS and PCR5 rules, proposed in DST and
DSmT respectively). So we use either the DS or the PCR5 rule to
combine the three bba’s $m_{ij}^R(.)$, $m_{ij}^G(.)$ and $m_{ij}^B(.)$ for each
pixel $x_{ij}$ in order to get the global bba $m_{ij}(.)$ used to estimate
the degree of belief that $x_{ij}$ belongs to an edge in the
given image. Since PCR5 is not associative, we must apply the
general PCR5 formula for combining the 3 sources (channels)
altogether⁵, as explained in detail in [17], Vol. 2, Chap. 1 &
2. A suboptimal approach requiring fewer computations would
consist in applying a sequential PCR5 fusion of these bba’s in
such a way that the two least conflicting bba’s are combined
first by PCR5, and the resulting bba is then combined by PCR5
with the third one according to (3). The simpler
PCR6 rule could also be used instead of PCR5; see
[17], Vol. 2.
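As an illustration of the PCR6 alternative mentioned above, the sketch below combines any number of bba’s at once by applying, to every conflicting product, the proportionalization shown in the three-source example of section II-A. It is a sketch written for this purpose, not the code of [17]; the function name `pcr6`, the dictionary representation of a bba and the channel masses are assumptions of this example.

```python
from itertools import product
from math import prod


def pcr6(bbas):
    """PCR6-style fusion of several bba's given as {frozenset: mass}."""
    fused = {}
    for combo in product(*(m.items() for m in bbas)):
        masses = [mass for _, mass in combo]
        weight = prod(masses)
        if weight == 0.0:
            continue
        common = frozenset.intersection(*(x for x, _ in combo))
        if common:
            # No conflict: the product goes to the intersection.
            fused[common] = fused.get(common, 0.0) + weight
        else:
            # Conflict: each source gets back a share proportional to its mass.
            total = sum(masses)
            for x, mass in combo:
                fused[x] = fused.get(x, 0.0) + weight * mass / total
    return fused


EDGE, NON_EDGE = frozenset({"edge"}), frozenset({"non_edge"})
UNKNOWN = EDGE | NON_EDGE

# Hypothetical bba's of one pixel in the R, G and B layers.
m_r = {EDGE: 0.7, UNKNOWN: 0.3}
m_g = {EDGE: 0.6, UNKNOWN: 0.4}
m_b = {NON_EDGE: 0.5, UNKNOWN: 0.5}
m_pixel = pcr6([m_r, m_g, m_b])
print(m_pixel)
```

With two sources this function gives the same result as the two-source PCR5 rule (3), in agreement with the remark of section II-A that PCR6 coincides with PCR5 in that case.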


C. Step 3: Decision-making 


The output of step 2 is the set of $N_i \times N_j$ bba’s $m_{ij}(.)$
associated with each pixel $x_{ij}$ of the image in the whole color
space (R, G, B). $m_{ij}(.)$ commits some degree of belief to
$\theta_1 \triangleq$ Pixel ∈ Edge, to $\theta_2 \triangleq$ Pixel ∉ Edge, and also to the
uncertainty $\theta_1 \cup \theta_2$. The binary decision-making process
consists in declaring whether the pixel $x_{ij}$ under consideration
belongs or not to an edge from the bba $m_{ij}(.)$, or in a
more complicated manner from $m_{ij}(.)$ and the bba’s of its
neighbours. In this paper, we just recall the principal methods
based on the use of $m_{ij}(.)$.


⁵ I.e. a generalization of the PCR5 formula described in section II-A.

Based on $m_{ij}(.)$ only, how to decide $\theta_1$ or $\theta_2$? Many
approaches have been proposed in the literature for answering
this question when working with an n-D frame $\Theta$. The pessimistic
approach consists in declaring the hypothesis $\theta_i \in \Theta$
which has the maximum of credibility, whereas the optimistic
approach consists in declaring the hypothesis which has the
maximum of plausibility. When the cardinality of the frame
$\Theta$ is greater than two, these two approaches can lead to
different final decisions. In our particular application, since
our frame $\Theta$ has only two elements, the final decision will
be the same if we use the max of credibility or the max of
plausibility criterion. Other decision-making methods suggest,
as a good balance between the aforementioned pessimistic and
optimistic approaches, to first approximate the bba by a
subjective probability measure using a suitable probabilistic
transformation, and then to choose the element of $\Theta$ which
has the highest probability. In practice, one suggests taking
as final decision the argument of the max of BetP(.) or of
the max of DSmP(.). In our binary frame case, however, these
two approaches also provide the same final decision as
the max of credibility approach. This can be easily proved
from the BetP(.) or DSmP(.) formulas. Indeed, let’s consider
$m(\theta_1) > m(\theta_2) > 0$ with $m(\theta_1) + m(\theta_2) + m(\theta_1 \cup \theta_2) = 1$
(which means that $\theta_1$ is taken as final decision because it has
a higher credibility than $\theta_2$); then one gets as approximate
subjective probabilities:


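$$BetP(\theta_1) = m(\theta_1) + m(\theta_1 \cup \theta_2)/2 = m(\theta_1) + K$$
$$BetP(\theta_2) = m(\theta_2) + m(\theta_1 \cup \theta_2)/2 = m(\theta_2) + K$$
$$DSmP(\theta_1) = m(\theta_1) \left[ 1 + \frac{m(\theta_1 \cup \theta_2)}{m(\theta_1) + m(\theta_2)} \right] = m(\theta_1) [1 + K']$$
$$DSmP(\theta_2) = m(\theta_2) \left[ 1 + \frac{m(\theta_1 \cup \theta_2)}{m(\theta_1) + m(\theta_2)} \right] = m(\theta_2) [1 + K']$$

where $K$ and $K'$ are two positive constants. From these
expressions, one sees that if $m(\theta_1) > m(\theta_2) > 0$, then also
$BetP(\theta_1) > BetP(\theta_2)$ and $DSmP(\theta_1) > DSmP(\theta_2)$, and
thus the final decision based on the max of BetP(.) or the max of
DSmP(.) is the same. Note that when $m(\theta_1) = m(\theta_2)$,
no rational decision can be drawn from $m(.)$ and only a
random decision procedure or an ad-hoc method can be used in
such a particular case.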

In summary, one sees that when working with a binary
frame $\Theta$, all common decision-making strategies provide the
same final decision, so there is no benefit in using
a complex decision-making procedure in that case. That is
why we adopt here the max of belief as final decision-making
criterion in our simulations. Note that besides the final
decision, and because we have $m(\theta_1 \cup \theta_2)$, we are also able (if we
want) to plot the level of uncertainty related to such a
decision (not presented in this paper).
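The equivalence of the four criteria on a binary frame can be checked numerically with the short sketch below, which uses the expressions of BetP and DSmP given above. The function name `decisions` and the test masses are assumptions of this example; ties are arbitrarily resolved as "non-edge", and at least one of the two singleton masses is assumed to be positive.

```python
def decisions(m_edge, m_non_edge):
    """Decision of each criterion for a bba on the binary frame of Eq. (7)."""
    m_unknown = 1.0 - m_edge - m_non_edge
    ratio = m_unknown / (m_edge + m_non_edge)  # the constant K' of the text
    scores = {
        "credibility": (m_edge, m_non_edge),
        "plausibility": (m_edge + m_unknown, m_non_edge + m_unknown),
        "BetP": (m_edge + m_unknown / 2.0, m_non_edge + m_unknown / 2.0),
        "DSmP": (m_edge * (1.0 + ratio), m_non_edge * (1.0 + ratio)),
    }
    return {name: "edge" if s_edge > s_non_edge else "non-edge"
            for name, (s_edge, s_non_edge) in scores.items()}


print(decisions(0.5, 0.2))
print(decisions(0.1, 0.3))
```

For any pair of singleton masses the four entries of the returned dictionary are identical, which is why the simulations can use the max of credibility alone.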


IV. SIMULATION RESULTS


In this section we present the results of our new edge 
detection algorithm tested on two color images for different 
parameter settings. 


A. Test on original Lena’s picture 


The Lena Soderberg picture is one of the most used images for
testing image processing algorithms in the literature [4], and
therefore we propose to test our algorithm on this reference
image. This image can be found as part of the USC SIPI
Image Database in their "miscellaneous" collection available
at http://sipi.usc.edu/database/index.php. The original Lena
picture scan is shown in Fig. 4-(a). Figure 5-(a)-(c) shows
the edge detection on each channel (layer) based on the bba’s
$m_{ij}^L(.|g_{ij}^L)$ of section III-A. One sees that the edges in different
channels are different, and the task of our proposed algorithm
is to combine efficiently the underlying bba’s $m_{ij}^L(.|g_{ij}^L)$
generating the subfigures 5-(a)-(c).

Figure 4. Lena’s picture before and after noise: (a) original Lena image, (b) Lena with noise.

Figure 5. Edge detections in each channel: (a) edge by R channel, (b) edge by G channel, (c) edge by B channel.

Figure 7. Sobel’s edge detector on Lena’s gray image.

Figure 8. DS-based edge detector on Lena’s color image.

Figure 9. PCR5 edge detector on Lena’s color image.


Sobel [9] and Canny [2] edge detectors are commonly used
in the image processing community, which is why we
compare our new edge detector with Canny’s and Sobel’s
approaches. The Canny and Sobel edge detectors are applied
directly to the gray image converted from the original Lena
color image of Fig. 4-(a). Figures 6-9 show the results of the
different edge detectors on Lena’s picture. In our simulations,
we took $\lambda = 0.06$, and $t$, defined as $t = p \cdot \max(g)$ in each
layer, was taken with $p_n = 0.17$ and $p_e = 0.19$, corresponding
to the gradient thresholds $[t_n^R, t_e^R] = [15, 17]$, $[t_n^G, t_e^G] = [13, 14]$
and $[t_n^B, t_e^B] = [11, 13]$. The max of credibility, plausibility,
DSmP or BetP used for decision-making to generate the final result
provides the same decision, as explained in section III-C,
which is normal in this binary frame case.

One sees that on the clean (noise-free) Lena picture,
our edge detector provides performance close to that of Sobel’s
detector applied to Lena’s gray image. Canny’s detector seems
to provide a better ability to detect some edges in Lena’s
picture than our method, but it also generates many more false
alarms. It is worth noting that the results provided by the DS-based
or PCR5-based edge detectors show a coarse location
of the edges. So it is quite difficult to draw a clear and
fair conclusion between these edge detectors, since it highly
depends on what we want, i.e. the reduction of false alarms
or the reduction of missed detections.


B. Test on Lena’s picture with noise 


In this simulation, we show how our edge detector works
on a noisy image. A sample of independent Gaussian noise
$\mathcal{N}(0, \sigma^2)$ is added to each pixel of each layer of the original
Lena picture, as seen in Fig. 4-(b). In the presented simulation,
$\sigma^2 = 1100$, which corresponds approximately to the
value of the variance of the blue channel and half the variance
of the others. Local edge detection for each layer based on
$m_{ij}^L(.|g_{ij}^L)$ is shown in Fig. 10-(a)-(c), where the red points
represent the ignorant pixels, which commit the most belief to
the ignorance $\theta_1 \cup \theta_2$. As shown in Fig. 10, the edge detection in
each channel is very noisy. Our method automatically commits
the highest belief value to uncertainty for most of the pixels
associated with an edge which actually correspond to noise.⁶
The edge detection based on the fusion result is interesting, as
shown in Fig. 11 and Fig. 12, because it shows the ability of our
edge detector to suppress the noise effects. For comparison,
we give in Fig. 13 and Fig. 14 the performance of the Canny and
Sobel edge detectors applied classically to the noisy gray-level
Lena picture. In this simulation, we took $\lambda = 0.06$,
and $t$ using $p_n = 0.22$ and $p_e = 0.39$ (i.e. $t_n = 0.22 \cdot \max(g)$
and $t_e = 0.39 \cdot \max(g)$) in
each layer, with $[t_e^R, t_n^R] = [36, 20]$, $[t_e^G, t_n^G] = [35, 19]$ and
$[t_e^B, t_n^B] = [31, 18]$. The decision-making is still based on the max
of credibility.
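The noisy test image can be reproduced along the following lines. This is a sketch under assumptions the paper does not state: the image is taken to be an 8-bit array of shape (rows, columns, 3), the noisy values are rounded and clipped back to [0, 255], and the function name `add_gaussian_noise` and the seed are hypothetical.

```python
import numpy as np


def add_gaussian_noise(image, variance=1100.0, seed=0):
    """Add independent zero-mean Gaussian noise to every pixel of every layer.

    Rounding and clipping to the 8-bit range are assumptions of this sketch.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, np.sqrt(variance), size=image.shape)
    noisy = np.rint(image.astype(float) + noise)
    return np.clip(noisy, 0, 255).astype(np.uint8)


# Uniform mid-gray color image used only to exercise the function.
image = np.full((64, 64, 3), 128, dtype=np.uint8)
noisy = add_gaussian_noise(image, variance=1100.0, seed=0)
print(noisy.shape, noisy.dtype)
```

Because the noise is drawn independently for each layer, a noise-induced gradient in one channel is usually not confirmed by the other two, which is what the fusion step exploits.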


The visual comparison and analysis of the results shown in
Figures 11-12 clearly indicate that our edge detector based
on the fusion of the beliefs constructed on each layer works much
better than the edge detection applied separately on each
layer. There is no ignorant pixel (red color) in
the fusion results, since the fusion process with the DS
or PCR5 rule effectively decreases the uncertainty. Our results
also show clearly that the Canny and Sobel edge detectors applied
to the noisy gray-level Lena picture are very sensitive to the
noise perturbations. Our proposed method (based on the DS rule
or on the PCR5 rule) is more robust to the noise perturbations
and provides better results than the Sobel or Canny edge detectors
for such a noisy image. For this tested image, it appears that
the results using the DS and PCR5 rules are very close, because
there is actually not much conflict between the bba’s of the layers,
and one knows that in such a case the behavior of the PCR5 rule is close
to that of the DS rule. The DS rule is usually good enough in the
low conflict case, whereas the PCR5 rule is preferred for the
combination of highly conflicting sources of evidence. So the
preference for PCR5 over the DS rule for edge detection
must be guided by the level of conflict which appears in the
layers of the color image that we need to process.


⁶ So we are also able at the layer level to filter these pixels (false alarms) before
applying the fusion. This has not yet been done in this work.

Figure 10. Edge detections in each channel on the noisy image: (a) edge by R channel, (b) edge by G channel, (c) edge by B channel.

Figure 11. DS edge detector on noisy Lena’s color image.


V. CONCLUSIONS AND PERSPECTIVES 


A new unsupervised edge detector for color images based
on belief functions has been proposed in this work. The basic
belief assignment (bba) associated with the edge of a pixel
in each channel of the image is defined according to its
gradient magnitude, and one can easily model the uncertainty
about our belief that it belongs or not to an edge. The PCR5 and
DS rules have been applied in this work to combine these
bba’s to get the global bba for final decision-making. Other
rules of combination of bba’s could also have been used
instead, but they are known to be less efficient than the PCR5
or DS rules in high and low conflict cases respectively. The
fusion process is able to reduce noise perturbations because
the noises are assumed to be independent between channels.
The final decision-making on the edge can be based
on the maximum of credibility, plausibility, DSmP or BetP
values. The first simulation, done on the original Lena
picture, shows that our edge detector works as well as the
classical Sobel edge detector and provides fewer false alarms
than Canny’s detector, but seems to generate more missed
detections. In our second simulation, based on the noisy Lena
image, the results show that our new edge detector is more
robust to the noise perturbations than the classical Sobel and Canny
edge detectors. As possible improvements of this algorithm and
for further research, we would like to include some morphological
or connectivity constraints at a higher level of processing
and develop an automatic technique for threshold selection. The
application of this new approach of edge detection to satellite
multispectral images is under investigation.

Figure 12. PCR5 edge detector on noisy Lena’s color image.

Figure 13. Sobel’s edge detector on noisy Lena’s gray image.

Figure 14. Canny’s edge detector on noisy Lena’s gray image.
DSmT Applied to Seismic and Acoustic Sensor Fusion


Erik P. Blasch 
Jean Dezert 
Pierre Valin 


Abstract: In this paper, we explore the use of DSmT for
seismic and acoustic sensor fusion. The seismic/acoustic
data is noisy, which leads to classification errors and
conflicts in declarations. DSmT affords the redistribution
of masses when there is a conflict. The goal of this paper
is to present an application and comparison of DSmT
with other classifier methods, including the support vector
machine (SVM) and Dempster-Shafer methods. The work
is based on two key references: (1) Marco Duarte with the
initial SVM classifier application of the seismic and
acoustic sensor data and (2) Arnaud Martin in Vol. 3 with
the Proportional Conflict Redistribution Rule 5/6
(PCR5/PCR6) developments. By using the developments
of Duarte and Martin, we were able to explore the various
aspects of DSmT in an unattended ground sensor
scenario. Using the receiver operating characteristic (ROC) curve, we
compare the methods for individual classification as well
as a measure of overall classification using the area under
the curve (AUC). Conclusions of the work show that
DSmT affords a lower false alarm rate because the
conflict information is redistributed over the set masses,
and is comparable to other classifier results when using a
maximum decision forced choice.


Keywords: Information Fusion, DSmT, PCR5, PCR6,
Area Under the Curve (AUC), SVM.


1 Introduction 


The goal of this paper is to present an application
and comparison of DSmT with other classifier methods.
The work is based on two key references: (1) Marco
Duarte with the initial classifier application of the
seismic and acoustic sensor data [1] and (2) Arnaud
Martin in Vol. 3 for the implementation of the DSmT
methods [2]. By using the developments of Duarte
and Martin, we were able to explore the various
aspects of DSmT in an unattended ground sensor
scenario. In the exploration of information fusion
metrics for classification, there is a need to develop
metrics of effectiveness that support the user’s utility
needs [3] and can vary over the sensor types,
environmental conditions, targets of interest, situational
context, and users [4].

DSmT is an extension of the Dempster-Shafer method
of evidential reasoning, which has been detailed
in numerous papers and texts: Advances and
applications of DSmT for information fusion (Collected
works), Vols. 1-3 [5].

Originally published as Blasch E., Dezert J., Valin P., DSmT Applied to
Seismic and Acoustic Sensor Fusion, Proc. IEEE Nat. Aerospace Electronics
Conf. (NAECON), 2011, and reprinted with permission.


In 2002, Dezert [6] introduced the methods for
the reasoning and, in 2003, presented the hyper-power
set notation for DSmT [7]. Recent applications
include the DSmT Proportional Conflict Redistribution
rule no. 5 (PCR5) applied to target tracking [8].

The key contributions of DSmT are the redistributions of
masses such that no refinement of the frame $\Theta$ is possible
unless a series of constraints is known. For example,
Shafer's model [9] is the DSm hybrid model in DSmT.
Since Shafer's model, authors have continued to refine the
method to address more precisely the combination of
conflicting beliefs [10, 11, 12] and the generalization of the
combination rules [13, 14]. An adaptive combination rule
[15] and rules for quantitative and qualitative
combinations [16] have been proposed. Recent examples
of sensor applications include electronic support
measures [17, 18] and physiological monitoring sensors
[19]. One application of DSmT that has not been fully
explored is seismic, magnetic, and acoustic
classification fusion of moving targets. Kadambe
applied an information-theoretic approach [20] and used
DSmT as integrity constraints [21], but did not take
advantage of the conflict redistribution.

Detecting moving vehicles in an urban area [22] is
an example where DSmT conflicting mass
redistribution could be helpful [8]. Detecting traffic can
be accomplished by fixed ground cameras or by dynamic
unattended ground vehicles (UGVs). If the sensors
are on UGVs, path planning is needed to route the
UGVs to observe the traffic [23, 24], and
cooperation among UGVs is necessary [25]. The
DARPA Grand Challenge featured sensors on mobile
UGVs observing the environment [26]. Mobile sensing
can be used to orient [27] or to conduct simultaneous
localization and mapping (SLAM) [28].

Deployed ground sensors can observe the
vehicles; however, they are subject to the quality of
the sensor measurements as well as obscurations.
One interesting question is how to deploy the fixed
sensors so as to optimize the performance of a system.
Efforts in distributed wireless sensor networks (WSNs) have
raised many issues in distributed processing,
communications, and data fusion [29]. In a dynamic
scenario, resource coordination [30] is needed both for
context assessment and for awareness of
impending situational threats [31, 32]. For distributed
sensing systems, combining sensors, data, and user
analysis requires pragmatic approaches to metrics [33,
34, 35, 36]. For example, Zahedi [37] develops a
QOI architecture for comparison of centralized versus
distributed sensor network deployment planning.


Information fusion research has addressed the problems
of databases for target trafficability [38], sensor
management [39], and processing algorithms [40]
from which to assess objects in the environment. Various
techniques have incorporated grouping object
movements [41], road information [42, 43], and updating
the object states based on environmental constraints [44].
Detecting, classifying, identifying, and tracking objects
[45] has been important for a variety of sensors, including
2D visual, radar [46], and hyperspectral [47] data;
however, newer methods are of interest for ground
sensors with 1D signals.


Seismic data provides passive sensing of ground
vibrations, which can be used for motion tracking. Passive
magnetic sensing can detect hidden objects that might
indicate intent. Finally, acoustic data can be used
for signature detection from vehicle engines [48].
The DARPA SENSIT program investigated
deploying a distributed set of wireless sensors along a
road to classify vehicles, as shown in Figure 1.





Figure 1. SENSIT data, from M. F. Duarte and Y. H. Hu,
"Vehicle Classification in Distributed Sensor Networks," 2004 [49].


The sensors record acoustic and seismic signals. Given
the deployed set of sensors, feature vectors were used to
classify signals based on the data from the seismic and
acoustic signals [49]. Various approaches include
combining the data with decision fusion [50], value fusion
[51], and simultaneous track and identification methods
[52, 53]. Information-theoretic approaches, including the
KL method, were applied to the data for sensor
management [54], as shown in Figure 2. Processing sensor
data for target classification using acoustic [55, 56] and
seismic [57] results has been explored in support of
information-theoretic sensor placement [58].


Much work has been completed using imaging sensors
and radar sensors for observing and tracking targets.
Video sensors are limited in power and subject to
day/night conditions. Likewise, radar line-of-sight
constraints preclude radars from observing in the same plane.
Neither imaging nor radar sensors have the advantages of
unattended ground sensors (UGSs), which can power on and off,
can work for a long time on battery power, and can be
deployed to remote areas.


[Figure 2 annotations: approximate summer prevailing winds; white circles labeled C1-C7 are Group 3 nodes (along the road); red arrow shows IR sensor.]

Figure 2. Deployed Sensors. From S. Kadambe and C.
Daniell, "Theoretic Based Performance of Distributed Sensor
Networks", AFRL-IF-RS-TR-2003, 231, October 2003 [54].





Track management situational awareness tools receive
input from sensor feeds (examples include electro-optical,
radar, electronic support measures (ESMs), and sonar) and
display this information to a user. User inputs include the
creation of new objects, such as tracks, contacts, and
targets. Methods to shorten the path from data to decisions include
fusing multiple tracks into a single track, incorporating
alerting mechanisms, and visualizing track data in a common
operational picture (COP). Sensor and track data can grow
rapidly as the user desires to keep historical data.

Our goal is to utilize the DSmT method for the fusion of
information from seismic and acoustic data in which each
sensor/classifier is in direct conflict with the other sensor.
We address (1) intelligent use of the data based on value
for classification, (2) DSmT sensor data fusion for
detection, classification, and positional location, and (3)
metrics to support sensor and data management in
support of user control.


2 Location / Detection 


We desire to track and identify the targets based on the 
sensor reports. In this study, we concentrate on the 
classification of targets which can be used with the 
kinematic/position information for target identification. 


2.1 Sensor Information Management 


The goal is to utilize the UGSs, which may be
acoustic, magnetic, seismic, or pyroelectric (passive
infrared, PIR, for motion detection). With a variety of sensors,
information fusion can (a) utilize the most appropriate
sensor at the correct time, (b) combine information from
both sensors on a single platform, (c) combine results
from multiple platforms, and (d) cue other sensors in a
hand-off fashion to effectively monitor the area. Sensor
exploitation requires an analysis of feature generation,
extraction, and selection (or construction, transformation,
selection, and evaluation). To provide track and ID results,
we develop a method for target classification.


2.2 Sensor Classification


Sensor exploitation includes detection, recognition, 
classification, identification and characterization of some 
object. Individual classifiers can be deployed at each level 
to robustly determine the object information. Popular 
methods include voting, neural networks, fuzzy logic, 
neuro-dynamic programming, support vector machines, 
Bayesian and Dempster-Shafer methods. One way to 
ensure the accurate assessment is to look at a combination 
of classifiers. Combination of classifiers [59] could include 
different sensors with classifiers, different methods over a 
single or multiple sensors, and various hierarchies of 
coordinating the classifiers such as Bayes nets and 
distributed processing. 

Classifier combination methods need to be
compared with respect to decisions, feature sets, and user
involvement. Selecting the optimal feature set depends on
the situation and environmental context in which the
sensors are deployed. An important question for sensor
and data management is how to measure effectiveness. For
instance, what is the fusion/decision gain obtained by
using a given set of classification methods and placement
methods? There is a need for a robust combination rule
that includes the location and detection of the sensors
subject to the target and environmental constraints.
Typically, a mobile sensor needs to optimize its route and
can be subject to interactive effects of pursuers and
evaders with other targets [60] as well as active
jamming of the signal [61].

Detecting targets from seismic and acoustic data in a
distributed net-centric fashion requires pragmatic
approaches to sensor and data management [62]. To
robustly track and ID a target requires both the structured
data from the kinematic movements and the
unstructured data for the feature analysis [63].


3 DSmT


Here we use the PCR5 and PCR6 rules and the DSmP transformation,
which are discussed below. We replace Smets' rule [10]
by the more effective Proportional Conflict Redistribution
rule no. 5 (PCR5), or possibly the simpler PCR rule
no. 6 (PCR6), and replace the pignistic transformation by
the more effective DSmP transformation to estimate target
classification probabilities. All details and justifications, with
examples of the PCR5 and PCR6 fusion rules and the DSmP
transformation, are freely available on the web in the
DSmT compiled texts [5], Vols. 2 & 3.


3.1 PCR5 and PCR6 fusion rules


In the DSmT (Dezert-Smarandache Theory) framework, the
Proportional Conflict Redistribution rule no. 5 (PCR5) is
generally used to combine basic belief assignments
(bba's). PCR5 transfers the conflicting mass only to the
elements involved in the conflict and proportionally to
their individual masses, so that the specificity of the
information is entirely preserved in this fusion process.
Let $m_1(.)$ and $m_2(.)$ be two independent bba's; then the
PCR5 rule is defined as follows (see [5], Vol. 2 for full
justification and examples): $m_{PCR5}(\emptyset) = 0$ and, $\forall X \in 2^\Theta \setminus \{\emptyset\}$,

$$
m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2)
+ \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right]
$$


where all denominators in the equation above are different
from zero. If a denominator is zero, that fraction is
discarded. Additional properties of PCR5 can be found in
[64]. Extension of PCR5 for combining qualitative bba's
can be found in [5], Vols. 2 & 3. All propositions/sets are
in a canonical form. A variant of PCR5, called PCR6, has
been proposed by Martin and Osswald in [5], Vol. 2, for
combining s > 2 sources. PCR6 coincides with PCR5
when one combines two sources. The difference between
PCR5 and PCR6 lies in the way the proportional conflict
redistribution is done as soon as three or more sources are
involved in the fusion. For example, let's consider three
sources with bba's $m_1(.)$, $m_2(.)$, and $m_3(.)$, $A \cap B = \emptyset$ for
the model of the frame $\Theta$, and $m_1(A) = 0.6$, $m_2(B) = 0.3$,
and $m_3(B) = 0.1$. With PCR5, the partial conflicting mass
$m_1(A)\, m_2(B)\, m_3(B) = (0.6)(0.3)(0.1) = 0.018$ is
redistributed back to A and B only with respect to the
following proportions respectively: $x_A^{PCR5} = 0.01714$ and
$x_B^{PCR5} = 0.00086$, because the proportionalization is [8]:

$$
\frac{x_A^{PCR5}}{m_1(A)} = \frac{x_B^{PCR5}}{m_2(B)\, m_3(B)} = \frac{m_1(A)\, m_2(B)\, m_3(B)}{m_1(A) + m_2(B)\, m_3(B)}
$$

that is

$$
\frac{x_A^{PCR5}}{0.6} = \frac{x_B^{PCR5}}{(0.3)(0.1)} = \frac{0.018}{0.6 + 0.03} \approx 0.02857
$$

thus

$$
x_A^{PCR5} = 0.60 \cdot 0.02857 \approx 0.01714, \qquad x_B^{PCR5} = 0.03 \cdot 0.02857 \approx 0.00086
$$


With the PCR6 fusion rule, the partial conflicting mass
$m_1(A)\, m_2(B)\, m_3(B) = (0.6)(0.3)(0.1) = 0.018$ is
redistributed back to A and B only with respect to the
following proportions respectively: $x_A^{PCR6} = 0.0108$ and
$x_B^{PCR6} = 0.0072$, because the PCR6 proportionalization is
done as follows:

$$
\frac{x_A^{PCR6}}{m_1(A)} = \frac{x_{B,2}^{PCR6}}{m_2(B)} = \frac{x_{B,3}^{PCR6}}{m_3(B)} = \frac{m_1(A)\, m_2(B)\, m_3(B)}{m_1(A) + m_2(B) + m_3(B)}
$$

that is

$$
\frac{x_A^{PCR6}}{0.6} = \frac{x_{B,2}^{PCR6}}{0.3} = \frac{x_{B,3}^{PCR6}}{0.1} = \frac{0.018}{0.6 + 0.3 + 0.1} = 0.018
$$

thus

$$
x_A^{PCR6} = (0.6)(0.018) = 0.0108, \quad x_{B,2}^{PCR6} = (0.3)(0.018) = 0.0054, \quad x_{B,3}^{PCR6} = (0.1)(0.018) = 0.0018
$$

and therefore with PCR6, one finally gets the following
redistributions to A and B:

$$
x_A^{PCR6} = (0.6)(0.018) = 0.0108, \qquad x_B^{PCR6} = x_{B,2}^{PCR6} + x_{B,3}^{PCR6} = 0.0054 + 0.0018 = 0.0072
$$
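
As a quick numerical check of the three-source example, the following Python sketch redistributes one partial conflicting mass under both rules. The function name `redistribute_partial_conflict` and its arguments are illustrative and are not taken from the Matlab codes of [5]; the sketch only generalizes the worked example, in which masses committed to the same hypothesis are grouped by product for PCR5 and by sum for PCR6.

```python
def redistribute_partial_conflict(terms, rule="PCR5"):
    """Split one partial conflicting mass back onto the hypotheses involved.

    terms: list of (hypothesis, mass) pairs, one pair per source.
    rule:  "PCR5" weights a hypothesis by the product of its masses,
           "PCR6" weights it by the sum of its masses.
    """
    if rule not in ("PCR5", "PCR6"):
        raise ValueError("rule must be 'PCR5' or 'PCR6'")
    conflict = 1.0
    weights = {}
    for hypothesis, mass in terms:
        conflict *= mass  # partial conflicting mass, e.g. m1(A) m2(B) m3(B)
        if rule == "PCR5":
            weights[hypothesis] = weights.get(hypothesis, 1.0) * mass
        else:
            weights[hypothesis] = weights.get(hypothesis, 0.0) + mass
    total = sum(weights.values())
    # Each hypothesis receives a share proportional to its weight.
    return {h: conflict * w / total for h, w in weights.items()}


example = [("A", 0.6), ("B", 0.3), ("B", 0.1)]
print(redistribute_partial_conflict(example, "PCR5"))  # A ~ 0.01714, B ~ 0.00086
print(redistribute_partial_conflict(example, "PCR6"))  # A ~ 0.0108,  B ~ 0.0072
```

In both cases the two shares add up to the partial conflicting mass 0.018, so no mass is lost in the redistribution.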


From the implementation point of view, PCR6 is simpler
to implement than PCR5. For convenience, Matlab
codes of PCR5 and PCR6 fusion rules can be found in [5].
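
For readers without Matlab, a minimal Python sketch of the two-source PCR5 rule follows. It is not the code of [5]: it assumes Shafer's model (exclusive hypotheses), represents a bba as a dictionary from `frozenset` to mass, and the function name `pcr5` is our own choice.

```python
from itertools import product


def pcr5(m1, m2):
    """Combine two bba's (dict: frozenset -> mass) with the two-source PCR5 rule."""
    fused = {}
    for (x1, a), (x2, b) in product(m1.items(), m2.items()):
        common = x1 & x2
        if common:
            # Conjunctive consensus: the product goes to the intersection.
            fused[common] = fused.get(common, 0.0) + a * b
        elif a + b > 0:
            # Conflict: the product a*b returns to x1 and x2 in proportion
            # to a and b; a zero denominator means the fraction is discarded.
            fused[x1] = fused.get(x1, 0.0) + a * a * b / (a + b)
            fused[x2] = fused.get(x2, 0.0) + b * b * a / (a + b)
    return fused


A, B = frozenset("A"), frozenset("B")
m1 = {A: 0.6, B: 0.4}
m2 = {A: 0.2, B: 0.8}
print(pcr5(m1, m2))
```

Because each conflicting product is split exactly between the two sets that caused it, the fused masses still sum to one without any global renormalization.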


3.2 The DSmP Transformation


The DSmP probabilistic transformation is an alternative to
the classical pignistic transformation which allows us to
increase the probabilistic information content (PIC), i.e. to
minimize the Shannon entropy, of the approximated
subjective probability measure drawn from any bba.
Justification and comparisons of DSmP(.) with respect to
BetP(.) and to other transformations can be found
in detail in [65] and [5], Vol. 3, Chap. 3.


BetP: The pignistic probability transformation, denoted
BetP, offers a compromise between the maximum of
credibility Bel and the maximum of plausibility Pl for
decision support. The BetP transformation is defined by
$BetP(\emptyset) = 0$ and, $\forall X \in G^\Theta \setminus \{\emptyset\}$, by

$$
BetP(X) = \sum_{Y \in G^\Theta} \frac{C_M(X \cap Y)}{C_M(Y)}\, m(Y)
$$

where $G^\Theta$ corresponds to the hyper-power set including all
the integrity constraints of the model (if any): $G^\Theta = 2^\Theta$ if
one adopts Shafer's model for $\Theta$, and $G^\Theta = D^\Theta$
(Dedekind's lattice) if one adopts the free DSm model for
$\Theta$ [5]. $C_M(Y)$ denotes the DSm cardinal of the set Y,
which is the number of parts of Y in the Venn diagram of
the model M of the frame $\Theta$ under consideration [5, Book
1, Chap. 7]. The BetP reduces to the pignistic transformation
of the Transferable Belief Model (TBM) when $G^\Theta$ reduces
to the classical power set $2^\Theta$, i.e. when one adopts Shafer's model.
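
To make the definition concrete, here is a minimal Python sketch of BetP restricted to Shafer's model, where the DSm cardinal of a set is simply its number of elements. The function name `betp` and the dictionary-of-`frozenset` representation of the bba are assumptions made for illustration; the free and hybrid DSm models would need a Venn-diagram based cardinal instead.

```python
def betp(m, frame):
    """Pignistic probability of each singleton of `frame` under Shafer's model.

    m:     dict mapping frozenset (focal element) -> mass
    frame: iterable of the exclusive hypotheses
    """
    p = {theta: 0.0 for theta in frame}
    for y, mass in m.items():
        # C_M({theta} & Y) / C_M(Y) is 1/|Y| when theta is in Y, else 0.
        for theta in y:
            p[theta] += mass / len(y)
    return p


m = {frozenset("A"): 0.3, frozenset("B"): 0.1, frozenset("ABC"): 0.6}
print(betp(m, "ABC"))  # the 0.6 on A∪B∪C is shared equally among A, B, C
```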


The DSmP transformation is defined by $DSmP_\epsilon(\emptyset) = 0$ and,
$\forall X \in G^\Theta \setminus \{\emptyset\}$, by

$$
DSmP_\epsilon(X) = \sum_{Y \in G^\Theta} \frac{\sum_{\substack{Z \subseteq X \cap Y \\ C(Z) = 1}} m(Z) + \epsilon \cdot C(X \cap Y)}{\sum_{\substack{Z \subseteq Y \\ C(Z) = 1}} m(Z) + \epsilon \cdot C(Y)}\, m(Y)
$$

where $C(X \cap Y)$ and $C(Y)$ denote the cardinals of the sets
$X \cap Y$ and $Y$ respectively; $\epsilon \geq 0$ is a small number which
allows one to reach the highest PIC value of the approximation
of m(.) into a subjective probability measure. Usually $\epsilon = 0$,
but in some particular degenerate cases, when the
$DSmP_{\epsilon=0}(.)$ values cannot be derived, the $DSmP_{\epsilon>0}$ values
can however always be derived by choosing $\epsilon$ as a very
small positive number, say $\epsilon = 1/1000$ for example, in
order to be as close as we want to the highest value of the
PIC. The smaller $\epsilon$, the better/bigger the PIC value one gets.
When $\epsilon = 1$ and when the masses of all elements Z having
$C(Z) = 1$ are zero, $DSmP_{\epsilon=1}(.) = BetP(.)$.
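
The role of the parameter in the DSmP formula is easier to see on a small case. The sketch below is a hedged illustration only: it assumes Shafer's model (cardinals are ordinary set sizes), evaluates DSmP on singletons, and uses names of our own choosing (`dsmp`, `eps`).

```python
def dsmp(m, frame, eps=0.001):
    """DSmP of each singleton of `frame` under Shafer's model.

    m:   dict mapping frozenset (focal element) -> mass
    eps: the tuning parameter of the transformation (eps >= 0)
    """
    p = {theta: 0.0 for theta in frame}
    for y, mass in m.items():
        # Denominator: singleton masses inside Y, plus eps times |Y|.
        denom = sum(m.get(frozenset([z]), 0.0) for z in y) + eps * len(y)
        if denom == 0:
            # Degenerate case mentioned in the text: eps = 0 and no
            # singleton mass inside Y, so a positive eps is required.
            raise ValueError("DSmP with eps=0 is undefined for this bba")
        for theta in y:
            # Numerator for X = {theta}: m({theta}) + eps * 1.
            p[theta] += mass * (m.get(frozenset([theta]), 0.0) + eps) / denom
    return p


m = {frozenset("A"): 0.3, frozenset("B"): 0.1, frozenset("ABC"): 0.6}
print(dsmp(m, "ABC", eps=0.0))    # ignorance mass follows the singleton masses
print(dsmp(m, "ABC", eps=0.001))  # nearly the same, but always well defined
```

With `eps=0.0`, the 0.6 committed to the whole frame is split 3:1 between A and B and nothing goes to C, which is how DSmP obtains a higher PIC than the equal split made by BetP.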


4 Example/Simulation 


We use the SENSIT data, which was described above and
which supports an unstructured data analysis. To perform
the data management, we use data mining [66] techniques
such as a support vector machine (SVM) [67, 68] to
process the unstructured data. Through analysis, we can
determine the optimum use of the data given
environmental conditions (e.g. obscurations) and the sensors'
capabilities to detect a moving target.

Figure 3 shows the methodology of comparison. A 
key comparison is made between combining all the 
acoustic and seismic data together for testing and 
training via the SVM versus using the outputs from 
the acoustic and seismic data separately from 
which conflicts in classification are detected and 
sent to DS and DSmT processing. 

Figure 3. Experimentation Flow.


4.1 Data Processing


We compare two cases: (1) processing the data
separately and (2) jointly processing the acoustic
and seismic results. Figure 4 shows the case of the
acoustic results.

Figure 4. Acoustic Results.


Figure 5 shows the seismic results.
Note that for the data set, the seismic results have a lower
probability of false alarms for target 3 and target 2;
however, target 2 exhibits more confusion.

Next we explore the case of the joint seismic and acoustic
data management and utilize the SVM for classification,
shown in Figure 6. Note the false alarm reduction, which
is desired by users.

Figure 6. Combined Results.

In general, the joint analysis supports better decision
making: confidence improved, as the probability of detection (PD)
was higher for a constant false alarm rate; accuracy improved as to the
target location from joint spatial measurements; and
timeliness in decision making improved, as fewer measurements
were needed to confirm the target ID (i.e. a decision made
with two modalities required fewer measurements than
that of a single modality).


4.2 Application of DS


Below, we show the results of the application of DS
methods. Given training and prediction results expressed as a
combined probability, we have for target 1, target 2, and
target 3 a vector $P = [P_1, P_2, P_3]$. Based on the prediction
results from the SVM, there are many conflicts between the
sensor decisions based on the maximum probability. When
a conflict occurs, it is better to acknowledge
the conflict and then redistribute the probabilities based on
a set notation. In this case, the focal elements are $\Theta = [\theta_1,
\ldots, \theta_n]$ = ['1', '1∩2', '2', '2∩3', '3', '1∩3', '1∩2∩3'].
Using the analysis by Martin, we conduct an analysis
over the set criterion. Figure 7 shows a
significant reduction in false alarms; however, the overall
classification as measured by the area under the curve
(AUC) is less than that of the SVM by itself. Thus,
there is a trade-off when using DS between the PD
at low false alarm (FA) rates and the overall classification performance.

Figure 7. DS allowing for set declarations.
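
The notion of conflict between the acoustic and seismic classifiers can be illustrated with a short Python sketch. It assumes, as stated in the conclusions, a Bayesian bba built from each probability vector P; the function names and the numerical vectors are hypothetical and are not SENSIT results.

```python
def bayesian_bba(p):
    """Turn a probability vector [P1, P2, P3] into a bba on singletons 1, 2, 3."""
    total = sum(p)
    return {frozenset([k + 1]): v / total for k, v in enumerate(p)}


def decisions_conflict(p_a, p_s):
    """True when the maximum-probability decisions of the two classifiers differ."""
    def best(p):
        return max(range(len(p)), key=lambda k: p[k])
    return best(p_a) != best(p_s)


def conflict_mass(m1, m2):
    """Total mass that the conjunctive rule would assign to the empty set."""
    return sum(a * b
               for x1, a in m1.items()
               for x2, b in m2.items()
               if not (x1 & x2))


p_acoustic = [0.7, 0.2, 0.1]  # hypothetical SVM outputs for targets 1..3
p_seismic = [0.2, 0.5, 0.3]
print(decisions_conflict(p_acoustic, p_seismic))
print(conflict_mass(bayesian_bba(p_acoustic), bayesian_bba(p_seismic)))
```

For Bayesian bba's the conflicting mass is one minus the inner product of the two probability vectors, so it is large whenever the classifiers favor different targets; this is the mass that DS normalizes away and that PCR5/PCR6 redistribute.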


To explore a comparison of approaches, we utilized the
bba and forced the evidential reasoner to choose a single
target. From this analysis, the AUC improves in
comparison to the SVM approaches, which are a
forced-choice analysis. Figure 8 plots the DS results (for
one target designation).

Figure 8. DSmT Single Target Detection.


4.3 Application of DSmT


DSmT, as described above, improves on the methods of
conflict redistribution. In this case, there were slight
alterations in the bba comparisons; however, with the
heuristic logic, the changes resulted in a classification that
was comparable to the complete SVM fusion analysis.

Figure 9 presents the DSmT results for set declaration
and Figure 10 shows the case of a forced target choice
from the DSmT. From these plots, we can see that the
set-based approach improves the detection for low false alarm
rates; however, for high false alarm rates, the detection
probability is increased over all false alarm rates. Using
the maximum of the target bba provides an analysis
threshold that renders the DSmT comparable to an SVM
(which is allowed to train over all the data available).

Figure 9. DSmT allowing for set declarations.

Figure 10. DSmT Single Target Detection.


In the table below, we look at the entire analysis using the
area under the ROC curve (AUC) as a key metric.
Additionally, there are cases in which the maximum AUC
and minimum AUC are improved but the overall analysis
(Total AUC) varies. We see from the comparison that the
DS and DSmT methods can improve single target
detection; however, the SVM alone (run over all the data)
performs slightly better in the information fusion
case.


Table 1: AUC Comparisons of SVM, DS, and DSmT

Method | Min AUC | Max AUC | Total AUC
A-SVM  | 0.786   | 0.821   | 2.401
S-SVM  | 0.696   | 0.844   | 2.335
C-SVM  | 0.791   | 0.851   | 2.472
DS     | 0.671   | 0.742   | 2.141
DS1    | 0.738   | 0.833   | 2.371
DSmT   | 0.728   | 0.751   | 2.224
DSmT1  | 0.760   | 0.855   | 2.440

A: Acoustic, S: Seismic, C: Combination
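
The AUC values in Table 1 summarize ROC curves. The paper does not state how the curves were computed, so the following Python sketch shows only one common approach under our own assumptions: the decision threshold is swept over the classifier scores for one target, and the area is integrated with the trapezoidal rule. The function names and the toy scores are illustrative.

```python
def roc_points(scores, labels):
    """Return (false alarm rate, detection probability) points for one target.

    scores: classifier confidence that the target is present
    labels: 1 if the target is truly present, 0 otherwise
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        detections = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        false_alarms = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((false_alarms / negatives, detections / positives))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]  # toy values, not SENSIT data
labels = [1, 1, 0, 1, 0, 0]
print(auc(roc_points(scores, labels)))
```

An AUC of 1 corresponds to perfect separation of the target from the other classes, and 0.5 to chance.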


5 Conclusions 


We have explored DS and DSmT methods for seismic and
acoustic information fusion. The goal of the paper was a
new application of the existing techniques presented by
Martin and Duarte for further demonstration of the various
modifications to the DS methods. Using the initial results,
the use of DSmT can be tailored to the seismic and
acoustic sensors, which demonstrate high conflicts in
decision outputs as they measure different target
phenomenologies. We utilized a Bayesian basic belief
assignment (bba) with only singletons as focal elements,
formed from the P vectors of the target probabilities. Future
work will use non-Bayesian approaches to get the bba's.

Information theoretic measures [69] and tracking
analysis [70] can support the sensor and data
management as well as determine the Quality of
Information and Quality of Service needs. Use of
the Area Under the Curve (AUC) provides decision support
for situational awareness for command and control, which we
can extend to higher dimensions [71]. Various other
sources of soft data (human reports) can be combined
with the hard data (physics-based sensing) [72] to
update the sensor management, placement, and
reporting of the situation based on the context and
the needs of users, such as measures of effectiveness
for mission support.


The Effective Use of the DSmT for Multi-Class Classification


Abstract


The extension of the Dezert-Smarandache theory (DSmT) to the multi-class framework has a feasible computational
complexity for various applications when the number of classes is limited, typically to two classes. In contrast, when
the number of classes is large, the DSmT generates a high computational complexity. This paper investigates the
effective use of the DSmT for multi-class classification in conjunction with Support Vector Machines using the
One-Against-All (OAA) implementation, which offers two advantages: firstly, it allows modeling the partial ignorance
by including the complementary classes in the set of focal elements during the combination process and, secondly, it allows
drastically reducing the number of focal elements using a supervised model by introducing exclusive constraints when classes
are naturally and mutually exclusive. To illustrate the effective use of the DSmT for multi-class classification, two
SVM-OAA implementations are combined according to three steps: transformation of the SVM classifier outputs into posterior
probabilities using the sigmoid technique of Platt, estimation of masses directly through the proposed model, and combination
of masses through the Proportional Conflict Redistribution rule no. 6 (PCR6). To demonstrate the effectiveness of the proposed
framework, a case study is conducted on handwritten digit recognition. Experimental results show that it is possible to reduce
efficiently both the number of focal elements and the classification error rate.

Keywords: Handwritten digit recognition; Support Vector Machines; Dezert-Smarandache theory; Belief assignments;
Conflict management


1. Introduction 


Nowadays, a large number of classifiers and feature generation methods are being developed in various application areas of
pattern recognition [1,2]. Nevertheless, no method has shown incontestable superiority over another in either
feature generation or classification. Rather than trying to optimize a single classifier by choosing the best features
for a given problem, researchers have found it more interesting to combine recognition methods [2,3]. Indeed, the combination of
classifiers allows exploiting the redundant and complementary nature of the responses issued from different classifiers.
Researchers have proposed increasingly numerous and varied approaches for combining classifiers, which led to the
development of several schemes in order to treat data in different ways [2,3]. Generally, three approaches for combining
classifiers can be considered: the parallel approach, the sequential approach, and the hybrid approach [2]. Furthermore, these can be
performed at a class level, at a rank level, or at a measure level [4-7].

In many applications, various constraints do not allow an efficient joint use of classifiers and feature generation methods,
leading to inaccurate performance. Thus, an appropriate operating method using mathematical approaches is needed,
which takes into account two notions: uncertainty and imprecision of the responses of classifiers. In general, most
theoretical advances devoted to the theory of probabilities are able to represent uncertain knowledge but
are unable to easily model information which is imprecise, incomplete, or not totally reliable. Moreover, they often lead
to confusing the concepts of uncertainty and imprecision with the probability measure. Therefore, new original theories
dealing with uncertain and imprecise information have been introduced, such as the fuzzy set theory [8], evidence theory
[9,10], possibility theory [11] and, more recently, the theory of plausible and paradoxical reasoning [12-14].

The evidence theory initiated by Dempster and Shafer, termed Dempster-Shafer theory (DST) [9,10], is generally
recognized as a convenient and flexible alternative to the Bayesian theory of subjective probability [15]. The DST is a
powerful theoretical tool which has been applied in many kinds of applications [16] for the representation of incomplete
knowledge, belief updating, and the combination of evidence [17,18] through the Dempster-Shafer combination rule.
Indeed, it offers a simple and direct representation of ignorance and has a low computational complexity [19] for most
practical applications.

Nevertheless, this theory presents some weaknesses and limitations, mainly when the combined evidence sources become
highly conflicting. Furthermore, Shafer's model itself does not necessarily hold in some fusion problems involving
paradoxical information. To overcome these limitations, a recent theory of plausible and paradoxical
reasoning, known as Dezert-Smarandache theory (DSmT) in the literature, was elaborated by Jean Dezert and Florentin
Smarandache for dealing with imprecise, uncertain, and paradoxical sources of information. The main objective of the
DSmT was to introduce combination rules that allow evidence issued from different
information sources to be combined correctly, even in the presence of conflicts between sources or of constraints corresponding to an
appropriate model (free or hybrid DSm models [12]). The DSmT has proved its efficiency in many current pattern
recognition application areas such as remote sensing [20-23], identification and tracking [24-29], biometrics [30-33],
computer vision [34-36], robotics [37-42] and, more recently, handwritten recognition applications [7,43,44], as well as many
others [12-14].

The use of the DSmT for multi-class classification has a feasible computational complexity for various applications when the
number of classes is limited, typically to two classes [43]. In contrast, when the number of classes is large, the DSmT
generates a high computational complexity closely related to the number of elements to be processed. Indeed, an analytical
expression defined by Tombak et al. [45] shows that the number of elements to be processed follows the sequence of
Dedekind's numbers [46,47]: 1, 2, 5, 19, 167, 7580, 7828353, ... For instance, if the number of classes belonging to the discernment
space is 8, then the number of elements to be dealt with in the DSmT framework is approximately $5.6 \times 10^{22}$. Hence, it is not easy to consider the
set of all subsets of the original classes (closed under the union and intersection operators) since it becomes intractable for
more than 6 elements in the discernment space [48]. Thus, Dezert and Smarandache [49] proposed a first work for ordering
all elements generated using the free DSm model for matrix calculus, as is done in the DST framework [50,51]. However, this
proposition has limitations since in practical applications it is more appropriate to manipulate only the focal elements [7, 52-54].
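
The growth of the free DSm model can be checked for small frames with the following Python sketch. It is an illustration under our own assumptions rather than the analytical expression of [45]: each part of the Venn diagram is given one bit, each class is the union of the parts that involve it, and the hyper-power set is obtained by closing the classes (plus the empty set) under union and intersection. The function name is hypothetical.

```python
from itertools import combinations


def free_dsm_hyper_power_set_size(n):
    """Count the elements of the hyper-power set (empty set included) for n classes."""
    # One Venn-diagram part per non-empty subset of classes; part k is bit k.
    parts = [frozenset(c) for r in range(1, n + 1)
             for c in combinations(range(n), r)]
    # Class i is the union (bitwise OR) of all parts that involve class i.
    thetas = [sum(1 << k for k, part in enumerate(parts) if i in part)
              for i in range(n)]
    elements = {0, *thetas}
    frontier = set(thetas)
    while frontier:
        new = set()
        for a in frontier:
            for b in elements:
                for c in (a | b, a & b):  # union and intersection
                    if c not in elements:
                        new.add(c)
        elements |= new
        frontier = new
    return len(elements)


print([free_dsm_hyper_power_set_size(n) for n in range(5)])
```

The brute-force closure is practical only for very small n, which is precisely the complexity issue raised in the text.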

To date, few works have focused on the computational complexity of the combination algorithms formulated in the
DSmT framework. Djiknavorian and Grenier [53] showed that there is a way to avoid the high level of complexity of the DSm
hybrid (DSmH) combination algorithm by designing code that can perform a complete DSmH combination in a very
short period of time. However, even if they have obtained an optimal process for evaluating the DSmH algorithm, first, some parts
of their code are not optimized and, second, it has been developed only for dynamic fusion. Martin [55] further
proposed a practical codification of the focal elements which assigns a single integer to each part of the Venn diagram
representing the discernment space. Contrary to Smarandache's codification [48] used in [56] and the codes proposed in
[53], the author argues that the constraints given by the application must be integrated directly into the codification of the focal
elements to obtain a reduced discernment space. Therefore, this codification can drastically reduce the number of possible
focal elements and thus the complexity of the DST as well as the DSmT frameworks. A disadvantage of this codification is that
the complexity increases drastically with the number of combined sources, especially when dealing with a problem in the
multi-class framework. To address this issue, Li et al. [57] proposed a criterion called evidence supporting measure of
similarity (ESMS), which consists in selecting, among all available sources, only a subset of sources of evidence in order to
reduce the complexity of the combination process. However, this criterion has been justified only for a two-class problem.
Nowadays, reducing both the number of combined sources and the size of the discernment space remains a
research challenge that still needs to be addressed.


In many pattern recognition applications, the classes belonging to the discernment space are naturally mutually
exclusive, such as in biometrics [30-33] and handwritten recognition applications [7,43,44]. Several classification
methods have been proposed, such as template matching techniques [58-60], minimum distance classifiers [61,62], support vector
machines (SVMs) [63], hidden Markov models (HMMs) [63-65], and neural networks [66,67]. In various pattern recognition
applications, SVMs have proved their performance since the mid-1990s compared with other classifiers [2]. The SVM
is based on an optimization approach in order to separate two classes by a hyperplane. In the context of multi-class
classification, this optimization approach is possible [68] but computationally very costly. Hence, two preferable methods
of multi-class implementation of SVMs have been proposed for combining several binary SVMs: One Against All
(OAA) and One Against One (OAO) [69-71]. The former is the most commonly used implementation in the
context of multi-class classification using binary SVMs; it constructs n SVMs to solve an n-class problem [72]. Each
SVM is designed to separate a simple class $\theta_i$ from all the others, i.e., from the corresponding complementary class
$\bar{\theta}_i = \bigcup_{j \neq i} \theta_j$. In contrast, the OAO implementation is designed to separate two simple classes $\theta_i$ and $\theta_j$ ($i \neq j$), which
requires $n(n-1)/2$ SVMs. Hence, various decision functions can be used, such as the Decision Directed Acyclic Graph
(DDAG) [73], since it has the advantage of eliminating all possible unclassifiable data.

Generally, binary classifiers are combined through very simple approaches such as a voting rule or the maximization of the decision functions produced by the classifiers. In this context, many combination operators can be used, especially in the DST framework [74]. In the same vein, some works have investigated the combination of binary SVM classifiers in the DST framework [75,76]. For instance, the pairwise approach has been revisited by Quost et al. [76-79] in the framework of the DST of belief functions for solving a multi-class problem. Hu et al. [80] used a DST-based combination method to combine multiple multi-class probability SVM classifiers in order to deal with distributed multi-source multi-class problems. Martin and Quidu [81] proposed an original DST-based approach for combining binary SVM classifiers using the OAO or OAA strategies, which provides decision support to experts for seabed characterization from sonar images. Burger et al. [82] applied a belief-based method for SVM fusion to hand shape recognition. Other works [83] have investigated optimizing the fusion of the sub-classifications and dealing with cases left undetermined because of uncertainty and doubt, through a simple method that combines the fusion methods of belief theories with SVMs. Recently, a regression-based approach [84] has been proposed to predict membership or belief functions, which are able to model correctly the uncertainty and imprecision of data.


In this work, we investigate the effective use of the DSmT for multi-class classification in conjunction with the SVM-OAA implementation, which offers two advantages. First, it allows modeling the partial ignorance by including the complementary classes in the set of focal elements, and hence in the combination process, contrary to the OAO implementation, which takes into account only the singletons. Second, it drastically reduces the number of focal elements, from Dedekind($n$) to $2n$. The reduction is performed through a supervised model using exclusive constraints.


Combining the outputs of SVMs within the DSmT framework requires that these outputs be transformed into membership degrees. Several methods for estimating mass functions have been proposed in both the DST and DSmT frameworks, either directly through special functions or indirectly through transfer models [9,85-88]. In our case, we propose a direct estimation method based on the sigmoid transformation of Platt [89], which allows us to satisfy the OAA implementation constraint.


The paper is organized as follows. Section 2 reviews the Proportional Conflict Redistribution (PCR6) rule based on DSmT. Section 3 describes the combination methodology for multi-class classification using the SVM-OAA implementation. Experiments conducted on a dataset of isolated handwritten digits are presented in Section 4. The last section summarizes the proposed combination framework and outlines future research directions.


2. Review of PCR6 combination rule 
In pattern recognition, the multi-class classification problem is generally formulated as an $n$-class problem where the classes are associated with pattern classes, namely $\theta_0, \theta_1, \ldots, \theta_{n-1}$. The parallel combination of two classifiers, regarded as information sources $S_1$ and $S_2$, is performed through the PCR6 combination rule based on the DSmT. For an $n$-class problem, a reference domain, also called the discernment space, must be defined for performing the combination; it is composed of a finite set of exhaustive and mutually exclusive hypotheses.
In the context of probability theory, the discernment space $\Theta$ is composed of $n$ elements, $\Theta = \{\theta_0, \theta_1, \ldots, \theta_{n-1}\}$, and a mapping function $m: \Theta \to [0, 1]$ is associated with each class; it defines the corresponding mass, verifying $m(\emptyset) = 0$ and $\sum_{i=0}^{n-1} m(\theta_i) = 1$. In the Bayesian framework, combining two sources of information by means of weighted-mean and consensus-based rules seems effective for non-conflicting responses [90-93]. In the opposite case, an alternative approach has been developed in the DSmT framework to deal with (highly) conflicting, imprecise and uncertain sources of information [14]. The PCR6 rule is an example of such an approach.
The main concept of the DSmT is to distribute the unitary mass of certainty over all the composite propositions built from elements of $\Theta$ with the $\cup$ (union) and $\cap$ (intersection) operators, instead of making this distribution over the elementary hypotheses only. The hyper-powerset $D^\Theta$ is therefore defined as:

1. $\emptyset, \theta_0, \theta_1, \ldots, \theta_{n-1} \in D^\Theta$.

2. If $A, B \in D^\Theta$, then $A \cap B \in D^\Theta$ and $A \cup B \in D^\Theta$.

3. No other elements belong to $D^\Theta$, except those obtained by using rules 1 or 2.


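The DSmT uses the generalized basic belief mass, also known as the generalized basic belief assignment (gbba), computed on the hyper-powerset of $\Theta$ and defined by a map $m(\cdot): D^\Theta \to [0, 1]$ associated with a given source of evidence, which can support paradoxical information, as follows: $m(\emptyset) = 0$ and $\sum_{A \in D^\Theta} m(A) = 1$. The combined masses $m_{PCR6}$ obtained from $m_1(\cdot)$ and $m_2(\cdot)$ by means of the PCR6 rule [13,14] are defined as:

$$m_{PCR6}(A_i) = \begin{cases} 0 & \text{if } A_i \in \Phi, \\ m_{12}(A_i) + \sum_{k=1}^{2} m_k(A_i)^2 \, L_k & \text{otherwise,} \end{cases} \qquad (1)$$

where

$$L_k = \sum_{\substack{Y_{\sigma_k(1)} \in D^\Theta \\ A_i \cap Y_{\sigma_k(1)} \in \Phi}} \frac{m_{\sigma_k(1)}(Y_{\sigma_k(1)})}{m_k(A_i) + m_{\sigma_k(1)}(Y_{\sigma_k(1)})} \qquad (2)$$

$\Phi = \{\Phi_M, \emptyset\}$ is the set of all relatively and absolutely empty elements, $\Phi_M$ is the set of all elements of $D^\Theta$ which have been forced to be empty in the hybrid model $M$ defined by the exhaustive and exclusive constraints, $\emptyset$ is the empty set, the denominator $m_k(A_i) + m_{\sigma_k(1)}(Y_{\sigma_k(1)})$ is different from zero, and $\sigma_k(1)$ counts from 1 to 2 avoiding $k$, i.e.:

$$\sigma_k(1) = \begin{cases} 2 & \text{if } k = 1, \\ 1 & \text{if } k = 2. \end{cases} \qquad (3)$$

Thus, the term $m_{12}(A_i)$ represents a conjunctive consensus, also called the DSm Classic (DSmC) combination rule [13,14], which is defined as:

$$m_{12}(A_i) = \begin{cases} 0 & \text{if } A_i = \emptyset, \\ \sum_{\substack{X, Y \in D^\Theta \\ X \cap Y = A_i}} m_1(X)\, m_2(Y) & \text{otherwise.} \end{cases} \qquad (4)$$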
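
As an illustration of the two-source PCR6 rule reviewed in this section, the following Python sketch represents each focal element as a frozenset of class labels, so that testing whether an intersection is empty reduces to a set intersection. The function name `pcr6_two_sources`, the dictionary representation and the optional `allowed` argument (standing in for the constraints that force elements to be empty) are assumptions made for this example and are not taken from the paper.

```python
from itertools import product


def pcr6_two_sources(m1, m2, allowed=None):
    """PCR6 combination of two mass functions (dict: frozenset -> mass).

    An intersection is treated as conflicting when it is empty or, if
    `allowed` is given, when it is not one of the allowed focal elements.
    """
    combined = {}
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        inter = x & y
        conflicting = not inter or (allowed is not None and inter not in allowed)
        if not conflicting:
            # Conjunctive consensus: the product goes to the intersection.
            combined[inter] = combined.get(inter, 0.0) + mx * my
        elif mx + my > 0.0:
            # The partial conflict mx * my is sent back to x and y,
            # proportionally to the two masses that produced it.
            combined[x] = combined.get(x, 0.0) + mx ** 2 * my / (mx + my)
            combined[y] = combined.get(y, 0.0) + my ** 2 * mx / (mx + my)
    return combined


# Two exclusive classes, two partly disagreeing sources (illustrative values).
theta0, theta1 = frozenset({0}), frozenset({1})
m1 = {theta0: 0.6, theta1: 0.4}
m2 = {theta0: 0.2, theta1: 0.8}
m_pcr6 = pcr6_two_sources(m1, m2)
```

Because each partial conflict is split in proportion to the two masses that generated it, the total mass is preserved by the redistribution.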
3. Methodology
The proposed combination methodology, shown in Fig. 1, is composed of two individual systems using SVM classifiers. Each one is trained on its own source of information; together they provide two kinds of complementary features, which are combined through the PCR6 rule. In the following, we describe each module of the system.


[Figure 1: block diagram. Acquisition (input data) feeds two SVM classifiers (first descriptor and second descriptor), whose outputs enter the DSmT-based parallel combination, followed by the decision.]

Fig 1. Structure of the combination scheme using SVM and DSmT


3.1. Classification based on SVM 

The classification based on SVMs has been widely used in many pattern recognition applications, such as handwritten digit recognition [2]. The SVM is a learning method introduced by Vapnik et al. [94], which tries to find an optimal hyperplane for separating two classes. Its concept is based on maximizing the margin, i.e., the distance between the hyperplane and the closest points of each class, with the aim of minimizing the misclassification error on both the training set and the test set.

Basically, SVMs have been defined for linearly separating two classes. When data are not linearly separable, a kernel function $K$ is used. All mathematical functions that satisfy Mercer's conditions are eligible to be an SVM kernel [94]. Examples of such kernels are the sigmoid kernel, the polynomial kernel, and the Radial Basis Function (RBF) kernel. The decision function $f: R^p \to \{-1, +1\}$ is then expressed in terms of a kernel expansion as:

$$f(x) = \sum_{k=1}^{Sv} \alpha_k y_k K(x_k, x) + b \qquad (5)$$

where $\alpha_k$ are Lagrange multipliers, $Sv$ is the number of support vectors $x_k$, which are training data such that $0 \le \alpha_k \le C$, $C$ is a user-defined parameter that controls the tradeoff between the machine complexity and the number of nonseparable points [73], and the bias $b$ is a scalar computed by using any support vector.


Finally, for a two-class problem, test data are classified according to:

$$x \in \begin{cases} \text{class } (+1) & \text{if } f(x) > 0, \\ \text{class } (-1) & \text{otherwise.} \end{cases} \qquad (6)$$

The extension of the SVM to multi-class classification is performed according to the One-Against-All (OAA) strategy [95]. Consider a set of $N$ training samples that are separable into $n$ classes $\{\theta_0, \theta_1, \ldots, \theta_{n-1}\}$, such that $\{(x_k, y_k^i) \in R^p \times \{\pm 1\};\ k = 1, \ldots, N;\ i = 0, \ldots, n-1\}$. The principle consists in separating one class from the other classes. Consequently, $n$ SVMs are required for solving an $n$-class problem.
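
To make the kernel expansion of Eq. (5) and the two-class rule of Eq. (6) concrete, here is a minimal Python sketch for a single binary SVM. The RBF parametrization exp(-||u - v||^2 / (2 sigma^2)), the function names and the toy support vectors are assumptions chosen for illustration; the paper does not specify them.

```python
import math


def rbf_kernel(u, v, sigma):
    """Gaussian RBF kernel, assumed form exp(-||u - v||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def svm_decision_value(x, support_vectors, labels, alphas, bias, sigma):
    """Kernel expansion of Eq. (5) evaluated at the pattern x."""
    return bias + sum(
        alpha * y * rbf_kernel(sv, x, sigma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )


def svm_classify(x, support_vectors, labels, alphas, bias, sigma):
    """Two-class rule of Eq. (6): +1 if f(x) > 0, otherwise -1."""
    f = svm_decision_value(x, support_vectors, labels, alphas, bias, sigma)
    return 1 if f > 0 else -1


# Toy model with two hand-picked support vectors (illustrative values only).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, +1]
alphas = [1.0, 1.0]
prediction = svm_classify((1.8, 1.9), support_vectors, labels, alphas,
                          bias=0.0, sigma=1.0)
```

In an OAA setting, one such decision value would be computed per class; in the approach described here these values are not thresholded directly but passed on to the mass estimation step.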


3.2. Classification Based On DSmT 
The proposed classification based on DSmT is presented in Fig. 2 and is conducted in three steps: i) estimation of masses, ii) combination of masses through the PCR6 combination rule, and iii) decision rule.






[Figure 2: block diagram. An acquisition system feeds two information sources $S_1$ and $S_2$; each applies $n$ binary SVMs, one per pair $(\theta_i, \bar{\theta}_i)$, followed by the estimation of the masses $m_1(\cdot)$ and $m_2(\cdot)$ over the set of focal elements $F$. The masses enter the DSmT-based combination rule, followed by decision making.]

Fig 2. DSmT-based parallel combination for multi-class classification


3.2.1. Estimation of Masses 

The difficulty of estimating masses increases when weights are assigned to the compound classes [96]. Transfer models of the mass function have therefore been proposed, whose aim is to distribute the initial masses over the simple and compound classes associated with each source. Here, the estimation of masses is performed in two steps: i) assignment of membership degrees to each simple class through the sigmoid transformation proposed by Platt [89], and ii) estimation of the masses of the simple classes and of their complementary classes using a supervised model.


• Calibration of the SVM outputs: Although the standard SVM is a very discriminative classifier, its output values are not calibrated for appropriately combining two sources of information. An interesting alternative is proposed in [89] to transform the SVM outputs into posterior probabilities. Given a training set of instance-label pairs $\{(x_k, y_k);\ k = 1, \ldots, N\}$, where $x_k \in R^p$ and $y_k \in \{-1, +1\}$, the unthresholded output of an SVM is a distance measure between a test pattern and the decision boundary, as given in (5). This output has no clear relationship with the posterior class probability $P(y = +1 | x)$ that the pattern $x$ belongs to the class $y = +1$. A possible estimate of this probability can be obtained [89] by modeling the distributions $P(f | y = +1)$ and $P(f | y = -1)$ of the SVM output $f(x)$ using Gaussian distributions of equal variance, and then computing the probability of the class given the output by using Bayes' rule. This yields a sigmoid that allows estimating the probabilities:

$$P(y = +1 | x) = \frac{1}{1 + \exp(A \, f(x) + B)} \qquad (7)$$

Parameters $A$ and $B$ are tuned by minimizing the negative log-likelihood of the training data:

$$-\sum_{k=1}^{N} \left[ t_k \log(Q_k) + (1 - t_k) \log(1 - Q_k) \right] \qquad (8)$$

where $Q_k = P(y_k = +1 | x_k)$ and $t_k = \frac{y_k + 1}{2}$ denotes the probability target.
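
The calibration step can be sketched in a few lines of Python. The sigmoid and the negative log-likelihood follow Eqs. (7) and (8) with targets t_k = (y_k + 1) / 2; the optimizer is a plain gradient descent used here only as a simple stand-in (the step size, iteration count and function names are assumptions, and Platt's original procedure uses a more elaborate solver).

```python
import math


def platt_probability(f, a, b):
    """Sigmoid of Eq. (7): P(y = +1 | x) for an SVM output f = f(x)."""
    return 1.0 / (1.0 + math.exp(a * f + b))


def negative_log_likelihood(outputs, labels, a, b):
    """Objective of Eq. (8), with targets t_k = (y_k + 1) / 2."""
    total = 0.0
    for f, y in zip(outputs, labels):
        t = (y + 1) / 2.0
        # Clamp the probability so that the logarithms stay finite.
        q = min(max(platt_probability(f, a, b), 1e-12), 1.0 - 1e-12)
        total -= t * math.log(q) + (1.0 - t) * math.log(1.0 - q)
    return total


def fit_platt(outputs, labels, steps=5000, learning_rate=0.01):
    """Tune (A, B) by plain gradient descent on Eq. (8)."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for f, y in zip(outputs, labels):
            t = (y + 1) / 2.0
            q = platt_probability(f, a, b)
            # Gradient of Eq. (8): sum of (t - q) * f for A, (t - q) for B.
            grad_a += (t - q) * f
            grad_b += t - q
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Toy SVM outputs with one overlapping sample per class (illustrative only).
outputs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [-1, -1, 1, -1, 1, 1]
a_hat, b_hat = fit_platt(outputs, labels)
```

With this sign convention, a fitted A that is negative means that larger SVM outputs map to larger probabilities of the positive class.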


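• Supervised Model: Let $m_1(\cdot)$ and $m_2(\cdot)$ denote the gbba provided by two distinct information sources $S_1$ (first descriptor) and $S_2$ (second descriptor), and let $F$ be the set of focal elements for each source, such that $F = \{\theta_0, \theta_1, \ldots, \theta_{n-1}, \bar{\theta}_0, \bar{\theta}_1, \ldots, \bar{\theta}_{n-1}\}$. The classes $\theta_i$ are separable (each one relative to its complementary class $\bar{\theta}_i$) using the SVM-OAA multi-class implementation, and correspond to different singletons of the patterns assumed to be known. Therefore, each compound element $A_i \notin F$ has a mass $m_1$ equal to zero; on the other hand, the mass of the complementary element $\bar{\theta}_i = \bigcup_{0 \le j \le n-1,\ j \ne i} \theta_j$ is different from zero and represents the mass of the partial ignorance. The same reasoning is applied to the classes issued from the second source $S_2$ and $m_2(\cdot)$. Hence, both gbba $m_1(\cdot)$ and $m_2(\cdot)$ are given, for $b = 1, 2$, as follows:

$$m_b(\theta_i) = \frac{P_b(\theta_i | x)}{Z_b}, \quad \forall \theta_i \in F \qquad (9)$$

$$m_b(\bar{\theta}_i) = \frac{\sum_{j=0,\ j \ne i}^{n-1} P_b(\theta_j | x)}{Z_b}, \quad \forall \bar{\theta}_i \in F \qquad (10)$$

$$m_b(A_i) = 0, \quad \forall A_i \in \Phi = D^\Theta \setminus F \qquad (11)$$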
where $Z_b = n \sum_{j=0}^{n-1} P_b(\theta_j | x)$ are the normalization factors introduced in the axiomatic approach in order to respect the mass definition (the masses of the $2n$ focal elements then sum to one), and $P_b$ are the posterior probabilities issued from the first source ($b = 1$) and the second source ($b = 2$), respectively. They are given for a test pattern $x$ as follows:

$$P_b(\theta_i | x) = \frac{1}{1 + \exp(A_{ib} \, f_{ib}(x) + B_{ib})} \qquad (12)$$

$A_{ib}$ and $B_{ib}$ are the parameters of the sigmoid function, tuned by minimizing the negative log-likelihood during training for each class of patterns $\theta_i$, and $f_{ib}(x)$ is the $i$-th output of the binary SVM classifier $SVM_{ib}$ issued from the source $S_b$, such that $i = 0, 1, \ldots, n-1$ and $b \in \{1, 2\}$.

In summary, the masses of all elements $A_i \in D^\Theta$ allocated by each information source $S_b$ ($b = 1, 2$) are obtained according to the following steps:
1. Define a frame of discernment $\Theta = \{\theta_0, \theta_1, \ldots, \theta_{n-1}\}$.
2. Classify a pattern $x$ through the SVM-OAA implementation.
3. Transform each SVM output into a posterior probability using Eq. (12).
4. Compute the masses associated with each class and its complementary class using Eq. (9) and Eq. (10), respectively.
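
The last step of this procedure can be sketched as follows, under stated assumptions: classes are represented by integer indices, focal elements by frozensets, and the normalization factor is taken as n times the sum of the posteriors, which is the value that makes the 2n masses of Eqs. (9) and (10) sum to one. The function name `supervised_masses` is hypothetical.

```python
def supervised_masses(posteriors):
    """Map OAA posteriors P(theta_i | x) to masses on theta_i and its complement.

    Returns a dict keyed by frozensets of class indices: {i} stands for the
    simple class theta_i, and the set of all other indices for its complement.
    """
    n = len(posteriors)
    total = sum(posteriors)
    z = n * total  # assumed normalization so that the 2n masses sum to one
    classes = frozenset(range(n))
    masses = {}
    for i, p in enumerate(posteriors):
        singleton = frozenset({i})
        complement = classes - singleton
        # Eq. (9): mass of the simple class.
        masses[singleton] = masses.get(singleton, 0.0) + p / z
        # Eq. (10): mass of the complementary class (partial ignorance).
        masses[complement] = masses.get(complement, 0.0) + (total - p) / z
    return masses


# Posteriors of three binary SVMs for one test pattern (illustrative values).
m_b = supervised_masses([0.7, 0.2, 0.4])
```

The posteriors of independent OAA classifiers need not sum to one, which is why the normalization is computed from their sum rather than assumed.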


3.2.2. Combination of masses 


In order to manage the conflict generated by the two information sources $S_1$ and $S_2$ (i.e. both SVM classifications), the combined masses are computed as follows:

$$m_c = m_1 \oplus m_2 \qquad (13)$$

where $\oplus$ denotes the PCR6 combination rule as given in (1). In some pattern recognition applications, such as handwritten digit recognition, we take as constraints the propositions $(\theta_i \cap \theta_j = \emptyset,\ \forall \theta_i, \theta_j \in \Theta)$ such that $i \ne j$, which allow separating every two classes belonging to $\Theta$.

Therefore, the hyper-powerset $D^\Theta$ is reduced to the set $F = \{\theta_0, \theta_1, \ldots, \theta_{n-1}, \bar{\theta}_0, \bar{\theta}_1, \ldots, \bar{\theta}_{n-1}\}$, which defines a particular case of Shafer's model. The conflict $K_c \in [0, 1]$ measured between the two sources is thus defined as:

$$K_c = \sum_{\substack{A_i, A_j \in F \\ A_i \cap A_j \in \Phi}} m_1(A_i) \times m_2(A_j) \qquad (14)$$

where $\Phi = D^\Theta \setminus F$ is the set of all relatively and absolutely empty elements, and $m_1(\cdot)$ and $m_2(\cdot)$ represent the corresponding generalized basic belief assignments provided by the two information sources $S_1$ and $S_2$, respectively.
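
The conflict measure of Eq. (14) can be illustrated with a short Python sketch. As in the text, a pair of focal elements is counted as conflicting when its intersection falls outside the retained set F; representing focal elements as frozensets of class indices and the function name `conflict` are assumptions made for this example.

```python
def conflict(m1, m2, focal_elements):
    """Conflict of Eq. (14) between two mass functions (dict: frozenset -> mass).

    A pair (A_i, A_j) is conflicting when its intersection is empty or is
    not one of the retained focal elements.
    """
    k = 0.0
    for a_i, mass_i in m1.items():
        for a_j, mass_j in m2.items():
            inter = a_i & a_j
            if not inter or inter not in focal_elements:
                k += mass_i * mass_j
    return k


# Three classes: F holds the singletons and their complements.
classes = frozenset({0, 1, 2})
singletons = [frozenset({i}) for i in classes]
focal = set(singletons) | {classes - s for s in singletons}

# Source 1 leans towards class 0, source 2 towards class 1 (illustrative).
m1 = {frozenset({0}): 0.5, frozenset({1, 2}): 0.5}
m2 = {frozenset({1}): 0.5, frozenset({0, 2}): 0.5}
k_c = conflict(m1, m2, focal)
```

In this toy case only the pair of disagreeing singletons has an empty intersection, so the conflict equals the product of their masses.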


3.2.3. Decision rule 
A membership decision of a pattern to one of the simple classes of $\Theta$ is performed using a statistical classification technique. First, the combined beliefs are converted into a probability measure using a probabilistic transformation, called the Dezert-Smarandache probability (DSmP), which maps a belief measure to a subjective probability measure [14] and is defined as:

$$DSmP_\epsilon(\theta_i) = m_c(\theta_i) + (m_c(\theta_i) + \epsilon) \sum_{\substack{A_j \in 2^\Theta \\ A_j \supset \theta_i \\ C_M(A_j) \ge 2}} \frac{m_c(A_j)}{\sum_{\substack{A_k \in 2^\Theta \\ A_k \subset A_j \\ C_M(A_k) = 1}} m_c(A_k) + \epsilon \, C_M(A_j)} \qquad (15)$$

where $i \in \{0, 1, \ldots, n-1\}$, $\epsilon \ge 0$ is a tuning parameter, $M$ is Shafer's model for $\Theta$, and $C_M(A_k)$ denotes the DSm cardinal of $A_k$ [12]. The maximum likelihood (ML) test is then used for decision making as follows:

$$x \in \theta_i \quad \text{if} \quad DSmP_\epsilon(\theta_i) = \max\{DSmP_\epsilon(\theta_j),\ 0 \le j \le n-1\} \qquad (16)$$

where $x$ is the test pattern characterized by both descriptors, which are used during the feature generation step, and $\epsilon$ is fixed to 0.001 in the decision measure given by (15).
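
A minimal Python sketch of this decision step is given below. It assumes Shafer's model, so that the DSm cardinal of a focal element is simply the number of classes it contains, and it represents the combined masses as a dictionary from frozensets of class indices to masses; the function names `dsmp` and `decide` are hypothetical.

```python
def dsmp(masses, epsilon=0.001):
    """DSmP of Eq. (15) for every singleton, assuming Shafer's model.

    `masses` maps frozensets of class indices to combined masses, and the
    DSm cardinal of a set is taken to be its number of elements.
    """
    classes = sorted(set().union(*masses))
    result = {}
    for c in classes:
        m_c = masses.get(frozenset({c}), 0.0)
        share = 0.0
        for a, mass_a in masses.items():
            if len(a) >= 2 and c in a:
                # Sum of the singleton masses contained in the compound set.
                singleton_sum = sum(masses.get(frozenset({k}), 0.0) for k in a)
                denominator = singleton_sum + epsilon * len(a)
                if denominator > 0.0:
                    share += mass_a / denominator
        result[c] = m_c + (m_c + epsilon) * share
    return result


def decide(masses, epsilon=0.001):
    """Decision of Eq. (16): the class with the largest DSmP."""
    probabilities = dsmp(masses, epsilon)
    return max(probabilities, key=probabilities.get)


# Two classes with half of the mass on their union (illustrative values).
combined = {frozenset({0}): 0.2, frozenset({1}): 0.3, frozenset({0, 1}): 0.5}
probabilities = dsmp(combined)
winner = decide(combined)
```

The mass of each compound element is shared among its singletons in proportion to their own masses, with epsilon keeping the ratio defined when those masses are zero.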


4. Experimental results 

4.1. Database description and performance evaluation 

For evaluating the effective use of the DSmT for multi-class classification, we consider a case study conducted on handwritten digit recognition. For this, we select the well-known US Postal Service (USPS) database, which contains normalized grey-level handwritten digit images of 10 numeral classes, extracted from US postal envelopes. All images are segmented and normalized to a size of 16x16 pixels. There are 7291 training samples and 2007 test samples, some of which are corrupted and difficult to classify correctly (Fig. 3). The partition of the database for each class into training and testing sets is reported in Table 1.

[Figure 3: sample digit images from the USPS database.]

Fig 3. Some samples with their alleged classes from USPS database.
Table 1. Partitioning of the USPS dataset

Classes      0     1     2    3    4    5    6    7    8    9
Training  1194  1005   731  658  652  556  664  645  542  644
Testing    359   264   198  166  200  160  170  147  166  177





For evaluating the performance of the handwritten digit classification, two common error measures are considered: the Error Rate per Class (ERC) and the Mean Error Rate (MER) over all classes. Both are expressed in %.
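
The two measures can be sketched in Python as follows. The text does not spell out how the MER aggregates the classes; the pooled definition used here (each class weighted by its number of test samples) is an assumption, chosen because, applied to the UGF error rates of Table 5 and the test counts of Table 1, it gives about 6.97%, in line with the 6.98% reported there, whereas an unweighted average of the ten ERC values would give about 7.67%.

```python
def error_rate_per_class(errors, counts):
    """ERC in %: misclassified test samples of a class over its test count."""
    return [100.0 * e / n for e, n in zip(errors, counts)]


def mean_error_rate(erc, counts):
    """MER in %, pooled over all test samples (assumed definition)."""
    return sum(r * n for r, n in zip(erc, counts)) / sum(counts)


# Test counts per digit from Table 1 and UGF error rates from Table 5.
test_counts = [359, 264, 198, 166, 200, 160, 170, 147, 166, 177]
erc_ugf = [1.95, 3.79, 8.08, 10.84, 11.50, 10.00, 5.29, 8.16, 10.84, 6.21]
mer_ugf = mean_error_rate(erc_ugf, test_counts)
```

The small gap to the reported value comes from the per-class rates being rounded to two decimals.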


4.2. Pre-processing

The acquired image of an isolated digit should be processed to facilitate the feature generation. In our case, the pre-processing module includes a binarization step using the method of Otsu [97], which eliminates the homogeneous background of the isolated digit and keeps the foreground information. The processed digit is then used for the recognition process without unifying the image size.
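
For readers unfamiliar with the binarization step, the following Python sketch shows a basic form of Otsu's method on a flat list of integer grey levels: it picks the threshold that maximizes the between-class variance of the histogram. The function names, the 256-level default and the convention that the foreground is brighter than the background are assumptions made for illustration, not details given in the paper.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level that maximizes the between-class variance."""
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))
    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for level in range(levels):
        weight_low += histogram[level]
        sum_low += level * histogram[level]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue  # one of the two groups is empty at this threshold
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


def binarize(pixels, threshold):
    """Foreground = 1 for pixels strictly above the threshold (assumed)."""
    return [1 if value > threshold else 0 for value in pixels]


# Synthetic two-level image: dark background and bright strokes.
image = [10] * 50 + [200] * 50
threshold = otsu_threshold(image)
binary = binarize(image, threshold)
```

If the digits were darker than the background, the comparison in `binarize` would simply be reversed.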


4.3. Feature Generation

The objective of the feature generation step is to highlight the relevant information that initially exists in the raw data. An appropriate choice of descriptor therefore significantly improves the accuracy of the recognition system. In this study, we use a collection of popular feature generation methods, which can be categorized into background features [98,99], foreground features [98,99], geometric features [2], and uniform grid features [100,101].


4.4. Validation of SVM Models

An SVM model is produced for each class according to the descriptor used. The training dataset is partitioned into two equal subsets of samples, which are used for training and validating each binary SVM, respectively. The validation phase thus allows finding the optimal hyperparameters for the ten SVM models. In our case, the RBF kernel is selected for the experiments. Furthermore, both the regularization and RBF kernel parameters $(C, \sigma)$ of each SVM are tuned experimentally during the training phase in such a way that the misclassification error on the training subset is zero and the error on the validation subset is minimal for each SVM separating a simple class from its complementary class.

Table 2 shows an example of the optimal parameters obtained during the training and validation phases using the UG-SVMs classifier. The parameters $n$ and $m$ define the number of lines (vertical regions) and columns (horizontal regions) of the grid, respectively, which have been optimized during the validation phase for each SVM model. All these parameters are then used during the testing phase. ERCs and ERCc are the Error Rates per Class for the simple and complementary classes, respectively. As we can see, the size of the uniform grid and the hyperparameters of each SVM should be tuned carefully in order to produce a reduced error.
Table 2. Optimal parameters of the UG-SVMs classifier

Parameters   SVM classifier (class)
             0     1     2     3     4     5     6     7     8     9
n            7     2     8     5     4     7     7     8     8     7
m            5     3     3     6    12     5     8     6     6    10
σ          3.5     1   3.5     4     3   3.5     4   3.5     5   4.5
C            5     3     4     5     4     4     2     4     3     5
ERCs (%)   2.0   1.0   4.6   5.7  15.6  10.0   2.7   5.5  11.8   4.0
ERCc (%)   0.6   1.1   0.4   0.3   0.1   0.3   0.1   0.1   0.3   0.4


4.5. Quantitative results and discussion
The testing phase is performed using all samples from the test dataset. We first evaluate the performance of handwritten digit recognition for each choice of descriptors using the SVM classifiers, and then evaluate the combination of the SVM classifiers within the DSmT framework.


4.5.1. Comparative analysis of features

The choice of complementary features is an important step for an efficient combination. Indeed, the DSmT-based combination offers an accurate performance when the selected features are complementary. In this section, we therefore compare the performance of the features in order to select the best ones for combination through the DSmT. For this, we evaluate each SVM-OAA implementation using Foreground Features (FF), Background Features (BF), Geometric Features (GF), Uniform Grid Features (UGF), and the descriptors obtained by concatenating at least two simple descriptors, such as (BF,FF), (BF,FF,GF) and (UGF,BF,FF,GF). The experiments have shown that an appropriate choice of both the descriptors and the concatenation order used to represent each digit class in the feature generation step provides an interesting error reduction. Table 3 reports the MER of the SVM classifiers for these descriptors. When concatenating background and foreground features (BF,FF), we observe a significant reduction of the MER: it drops by 6.71 points relative to FF alone (from 18.87% to 12.16%). A further reduction of 1.50 points (to 10.66%) is obtained when concatenating BF, FF and GF. This indicates that BF, FF and GF are complementary and are suitable for concatenation. In contrast, when concatenating UGF with BF, FF and GF, the MER increases by 2.73 points relative to UGF alone (from 6.98% to 9.71%). This shows that concatenation does not always improve the classification performance. We therefore expect the UGF and (BF,FF,GF) descriptors to be more suitable for combination through the DSmT.


Table 3. Mean error rates of the SVM classifiers using different methods of feature generation

Descriptor              MER (%)
(a) FF                  18.87
(b) (BF,FF)             12.16
(c) (BF,FF,GF)          10.66
(d) UGF                  6.98
(e) (UGF,BF,FF,GF)       9.71
4.5.2. Performance evaluation of the proposed combination framework

In these experiments, we evaluate handwritten digit recognition based on a combination of SVM classifiers through the DSmT. The proposed combination framework allows exploiting the redundant and complementary nature of the (BF,FF,GF) and UGF-based descriptors and managing the conflict between the outputs of the SVM classifiers.

Decision making is done only on the simple classes belonging to the frame of discernment. Hence, in both the combination process and the calculation of the decision measures, we consider the masses associated with all classes representing the partial ignorance, $\bar{\theta}_i = \bigcup_{0 \le j \le n-1,\ j \ne i} \theta_j$, and $\theta_i \cap \theta_j$ such that $i \ne j$. In order to appreciate the advantage of combining two sources of information through the DSmT-based algorithm, Figure 4 shows the distribution of the conflict measured for each test sample between both SVM-OAA implementations using the (BF,FF,GF) and UGF-based descriptors for the 10 digit classes ($\theta_i$, $i = 0, 1, \ldots, 9$). Table 4 reports the minimal and maximal values of the conflict ($K_{c_i}$, $i = 0, 1, \ldots, 9$) generated through the supervised model, which represent the mass assigned to the empty set after the combination process. As we can see, the conflict is maximal for the digit 4 while it is minimal for the digit 9.
Fig 4. Conflict between both SVMs classifiers using (BF,FF,GF) and UGF-based descriptors for the ten digit classes ($\theta_i$, $i = 0, 1, \ldots, 9$), respectively.


Table 4. Ranges of conflict variations measured between both SVM-OAA implementations using (BF,FF,GF) and UGF-based descriptors





Class Minimal conflict (10°) Maximal conflict (107) 





0 2.149309 2.9933 
1 6.999035 2.9964 
2 2.747717 2.9992 
3 2.936855 2.9994 
4 0.494599 3.0000 
5 1.868961 2.9970 
6 2.537015 2.9887 
7 2.826402 2.9983 
8 1.485899 2.9910 
9 0.276778 2.9999 





For an objective evaluation, Table 5 shows the ERC and MER produced by three SVM-OAA implementations using UGF, (BF,FF,GF) and the descriptor resulting from the concatenation of UGF and (BF,FF,GF) (i.e. combination at the feature level), and finally by the PCR6 combination rule (i.e. combination at the measure level) performed on the (BF,FF,GF) and UGF-based descriptors.


Table 5. Error rates of the proposed framework with PCR6 combination rule using (BF,FF,GF) and UGF descriptors

              Descriptor              Concatenation     Combination rule
ERC (%)    (BF,FF,GF)     UGF       (UGF,BF,FF,GF)         PCR6
0              6.69       1.95           9.75              1.95
1              4.55       3.79           3.79              3.03
2             12.63       8.08           3.54              6.06
3             17.47      10.84          18.67             10.84
4             20.00      11.50          19.50              9.00
5             16.87      10.00          10.62              7.50
6              2.94       5.29           4.71              3.53
7              8.84       8.16           8.84              4.76
8             12.05      10.84          10.24              6.63
9             10.73       6.21          10.17              5.65
MER (%)       10.66       6.98           9.71              5.43





Overall, the proposed framework using the PCR6 combination rule is more suitable than the individual SVM-OAA implementations and the concatenation: it provides a MER of 5.43%, compared with 6.98% for UGF, 10.66% for (BF,FF,GF) and 9.71% for the concatenation. However, when inspecting each class carefully, we can note that the PCR6 combination rule keeps or reduces the ERC in most cases, except for the samples belonging to classes $\theta_2$ and $\theta_6$ (see Table 5). This bad performance is due to the wrong characterization by both the UGF and (BF,FF,GF)-based descriptors. In other words, the PCR6 combination is not reliable when the complementary information provided by both descriptors is wrongly preserved.
Thus, the PCR6 combination rule correctly manages the conflict generated by the SVM-OAA implementations, even when they provide very small values of the conflict (see Table 4), specifically in the case of samples belonging to $\theta_9$. The DSmT is therefore appropriate for solving the handwritten digit recognition problem. Indeed, the PCR6 combination rule allows an efficient redistribution of the partial conflicting mass only to the elements involved in the partial conflict. After redistribution, the combined mass is transformed into the DSm probability and the maximum likelihood (ML) test is used for decision making. Finally, the proposed algorithm in the DSmT framework is the most stable across all experiments, whereas the recognition accuracies of both individual SVM classifiers vary significantly.


5. Conclusion and future work

In this paper, we proposed an effective use of the DSmT for multi-class classification using jointly the SVM-OAA implementation and a supervised model. Exclusive constraints are introduced through a direct estimation technique to compute the belief assignments and reduce the number of focal elements. The proposed framework therefore drastically reduces the computational complexity of the combination process for multi-class classification. A case study conducted on handwritten digit recognition shows that the proposed supervised model with the PCR6 rule yields the best performance compared with the SVM multi-classifications, even when they provide uncalibrated outputs. As a continuation of the present work, the next objective is to use one-class classifiers instead of the OAA implementation of SVM in order to obtain a fixed number of focal elements within the DSmT combination process. This will allow us to have a feasible computational complexity independently of the number of combined sources and the size of the discernment space.
Intelligent Alarm Classification Based on DSmT


Albena Tchamova 
Jean Dezert 


Abstract—In this paper the critical issue of alarm classification and prioritization (in terms of degree of danger) is considered and realized on the basis of the Proportional Conflict Redistribution rule no.5, defined in Dezert-Smarandache Theory of plausible and paradoxical reasoning. The results obtained show the strong ability of this rule to handle, in a coherent and stable way, the evolution of all possible degrees of danger relating to a set of a priori defined, out of the ordinary dangerous directions. A comparison with the performance of Dempster's rule is also provided. Dempster's rule shows weakness in resolving the cases examined. In the Emergency case, Dempster's rule does not respond to the level of conflict between sound sources, leading to ungrounded decisions. In the case of the lowest danger priority (perturbed Warning mode), Dempster's rule could cause a false alarm and can deflect attention from the real dangerous source by assigning a wrong steering direction to the surveillance camera.

Keywords—Alarm classification; DSmT; DST; data fusion. 


I. INTRODUCTION 


Alarm classification and prioritization is a very challenging and difficult task. An overflowing number of alarms can become a serious source of confusion, especially in dangerous cases, when a proper immediate response is needed. The problem is really critical, because the information available for alarm processing is uncertain, imprecise, and even conflicting. In some cases, alarms that are really significant and dangerous could be incorrectly interpreted as false and therefore ignored. The resulting delay of the proper response could cause significant damage.

Much work has been done over the years, because the
importance of this problem has been recognized since the 1960s
in a wide range of surveillance settings: industry (power plants,
oil refineries), clinical alarms in medicine, and civilian and
military monitoring. Today surveillance (military and civilian)
and environmental monitoring systems are characterized by
smart operational control, based on the intelligent analysis
and interpretation of alarms coming from a variety of sensors
installed in the observation area. Many approaches have been
adopted and applied to this problem. In
[1] a generic neuro-expert system architecture for training
neural networks in alarm processing is developed, which
is satisfactory when the training set covers a wide enough range
of scenarios. An expert system with temporal reasoning for
alarm processing is proposed in [2]. Fault detection and alarm
processing in a loop system using a fault detection system is
presented in [3]. In [4] the authors consider a methodology
based on both artificial neural networks and fuzzy logic for
alarm identification. The tasks of alarm processing, fault
diagnosis and comprehensive validation of protection performance
are discussed and resolved in [5] using knowledge-based
systems and a model-based reasoning approach. In [6] fuzzy
logic is used to prioritize the
alarms during alarm floods, which eases the burden that
meaningless or false alarms place on operators. In case of multiple
suspicious signals, generated by a number of sensors in the
observed area, the problem of alarm classification requires
the most dangerous among them to be correctly recognized,
in order to decide properly where the video camera should
be oriented. Because of the uncertainty and conflicts encountered
in the signal data, all suspicious sound signals must be
processed, analyzed and interpreted correctly and in a timely
manner, separately at each sensor’s level in the observed area.
Such conflicts could weaken or even mislead the
decision about the degree of danger in a critical situation.
That is why a strategy for an intelligent, scan-by-scan
combination/updating of the sound data generated by each sensor
is needed in order to provide the surveillance system with
a meaningful output. Various well-known methods
for combining information could be applied. The most widely
used so far, Dempster-Shafer Theory (DST) [9], proposes
a suitable mathematical model for uncertainty representation,
but its weak point in applications relates to the normalization
factor, which yields inadequate results when the sources to
combine are highly conflicting. To overcome this drawback,
we apply the Proportional Conflict Redistribution rule no. 5
(PCR5), defined in Dezert-Smarandache Theory (DSmT) of
plausible and paradoxical reasoning [7]. It provides a powerful
and efficient way of combining and utilizing all the
available information, allowing for conflicts and
paradoxes between the elements of the frame of discernment.
A comparison with DST performance based on Dempster’s
rule of combination is also provided in order to evaluate
the ability of DSmT to assure awareness about the alarms’
classification and prioritization in case of sound source data
discrepancies and to improve the decision-making process about
the degree of danger. In section II we recall the basics of DST and
Dempster’s rule. The basics of the PCR5 fusion rule are outlined in
section III. Section IV relates to the decision-making support
used to decide which sound source is the most dangerous.
In section V, we present the problem of alarm classification
and examine two solutions to it, based on PCR5 and
Dempster’s rule. In section VI, both solutions are evaluated
and compared on a given simulation
scenario that includes three sensors generating three types of
signals (warning, alarm and emergency). Concluding remarks
are given in section VII.


II. BASICS OF DST 


DST [9] proposes a suitable mathematical model for uncertainty
representation. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ be a frame
of discernment of a problem under consideration containing $n$
distinct elements $\theta_i$, $i = 1, \ldots, n$. A basic belief assignment
(bba, also called a belief mass function) $m(.) : 2^\Theta \to [0,1]$
is a mapping from the power set of $\Theta$ (i.e. the set of subsets
of $\Theta$), denoted $2^\Theta$, to $[0,1]$, that must satisfy the following
conditions: 1) $m(\emptyset) = 0$, i.e. the mass of the empty set (impossible
event) is zero; 2) $\sum_{X \in 2^\Theta} m(X) = 1$, i.e. the mass of belief
is normalized to one. $m(X)$ represents the mass of belief
exactly committed to $X$. The vacuous bba characterizing full
ignorance is defined by $m_v(.) : 2^\Theta \to [0,1]$ such that
$m_v(X) = 0$ if $X \neq \Theta$, and $m_v(\Theta) = 1$. From any bba
$m(.)$, the belief function $Bel(.)$ and the plausibility function
$Pl(.)$ are defined for all $X \in 2^\Theta$ as $Bel(X) = \sum_{Y | Y \subseteq X} m(Y)$
and $Pl(X) = \sum_{Y | X \cap Y \neq \emptyset} m(Y)$. $Bel(X)$ and $Pl(X)$ are
classically seen as lower and upper bounds of an unknown
probability $P(X)$ of $X$. The Dempster-Shafer (DS) rule of
combination [9] is a mathematical operation, denoted $\oplus$, which
corresponds to the normalized conjunctive fusion rule. Based
on Shafer’s model of the frame, the combination of two
independent and distinct sources of evidence characterized by
their bba’s $m_1(.)$ and $m_2(.)$ and related to the same frame of
discernment $\Theta$ is defined by $m_{DS}(\emptyset) = 0$, and for all
$X \in 2^\Theta \setminus \{\emptyset\}$ by

$$m_{DS}(X) = [m_1 \oplus m_2](X) = \frac{m_{12}(X)}{1 - K_{12}} \tag{1}$$

where

$$m_{12}(X) \triangleq \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) \tag{2}$$

corresponds to the conjunctive consensus on $X$ between the
two sources of evidence. $K_{12}$ is the total degree of conflict
between the two sources of evidence defined by

$$K_{12} = m_{12}(\emptyset) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = \emptyset}} m_1(X_1)\, m_2(X_2) \tag{3}$$

The DS rule is commutative and associative. The weak point
of this rule is its behavior when $K_{12} \to 1$, because it can
generate unexpected (at least very disputable) results [11].
When $K_{12} = m_{12}(\emptyset) = 1$, the two sources are said to
be in total conflict and their combination cannot be applied,
since the DS rule is mathematically not defined because of the 0/0
indeterminacy [9].
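
As an illustration of Eqs. (1)-(3), the following is a minimal Python sketch of Dempster's rule. The representation is an assumption made for this example only: a bba is a dictionary mapping focal elements (frozensets of hypothesis labels) to masses, and the function names `conjunctive` and `dempster` are hypothetical. The two input bba's are taken from Rule 1 and Rule 3 of the rule base given in section V.

```python
from itertools import product


def conjunctive(m1, m2):
    """Conjunctive consensus m12 of Eq. (2).

    The mass stored under the empty frozenset is the conflict K12 of Eq. (3).
    """
    m12 = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        x = x1 & x2  # set intersection of the two focal elements
        m12[x] = m12.get(x, 0.0) + w1 * w2
    return m12


def dempster(m1, m2):
    """Dempster-Shafer rule of Eq. (1): normalize m12 by 1 - K12."""
    m12 = conjunctive(m1, m2)
    k12 = m12.pop(frozenset(), 0.0)
    if abs(1.0 - k12) < 1e-12:
        raise ValueError("total conflict: the DS rule is not defined")
    return {x: w / (1.0 - k12) for x, w in m12.items()}


E, A, W = frozenset("E"), frozenset("A"), frozenset("W")
m1 = {E: 0.9, E | A: 0.1}          # observation supporting Emergency
m2 = {W: 0.6, E | A | W: 0.4}      # observation supporting Warning
k12 = conjunctive(m1, m2)[frozenset()]
fused = dempster(m1, m2)
```

In this example the two sources conflict on 60% of the product mass, and after normalization the fused bba carries no mass on W at all, which is the kind of behavior discussed for highly conflicting sources.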


III. BASICS OF PCR5 FUSION RULE 


The idea behind the Proportional Conflict Redistribution
rule no. 5 (see [7], Vol. 3) is to transfer conflicting masses
(total or partial) proportionally to non-empty sets involved in
the model according to all integrity constraints. The general
principle of PCR rules is then to: 1) calculate the conjunctive
consensus between the sources of evidence; 2) calculate
the total or partial conflicting masses; 3) redistribute the
conflicting mass (total or partial) proportionally on non-empty
sets involved in the model according to all integrity constraints.
Under Shafer’s model assumption of the frame $\Theta$, the PCR5
combination rule for only two sources of information is
defined as: $m_{PCR5}(\emptyset) = 0$ and for all $X \in 2^\Theta \setminus \{\emptyset\}$

$$m_{PCR5}(X) = m_{12}(X) + \sum_{\substack{Y \in 2^\Theta \setminus \{X\} \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2\, m_1(Y)}{m_2(X) + m_1(Y)} \right] \tag{4}$$

where $m_{12}(X)$ corresponds to the conjunctive consensus on
$X$ between the two sources and where all denominators are
different from zero. If a denominator is zero, that fraction is
discarded. All sets involved in the formula are in
canonical form. No matter how
big or small the conflicting mass is, PCR5 mathematically
does a better redistribution of the conflicting mass than DS,
since PCR5 goes backwards on the tracks of the conjunctive
rule and redistributes the partial conflicting masses only to the
sets involved in the conflict and proportionally to their masses
put in the conflict, considering the conjunctive normal form
of the partial conflict. PCR5 is quasi-associative and preserves
the neutral impact of the vacuous belief assignment.
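
A minimal Python sketch of the two-source PCR5 rule of Eq. (4) under Shafer's model may help to make the redistribution concrete. As an assumption of this example, a bba is a dictionary from frozensets of hypothesis labels to masses, and the function name `pcr5` is hypothetical. Each conflicting pair of focal elements returns its product mass to the two sets involved, in proportion to the masses they put in the conflict.

```python
from itertools import product


def pcr5(m1, m2):
    """Two-source PCR5 fusion following Eq. (4), Shafer's model assumed."""
    fused = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        x = x1 & x2
        if x:
            # Non-empty intersection: contributes to the conjunctive consensus.
            fused[x] = fused.get(x, 0.0) + w1 * w2
        elif w1 + w2 > 0.0:
            # Partial conflict w1 * w2: split between x1 and x2 proportionally
            # to w1 and w2 (a zero denominator means the fraction is discarded).
            conflict = w1 * w2
            fused[x1] = fused.get(x1, 0.0) + w1 * conflict / (w1 + w2)
            fused[x2] = fused.get(x2, 0.0) + w2 * conflict / (w1 + w2)
    return fused


E, A, W = frozenset("E"), frozenset("A"), frozenset("W")
THETA = E | A | W
m1 = {E: 0.9, E | A: 0.1}      # observation supporting Emergency
m2 = {W: 0.6, THETA: 0.4}      # observation supporting Warning
fused = pcr5(m1, m2)
unchanged = pcr5(m1, {THETA: 1.0})  # fusion with the vacuous bba
```

With these inputs the conflicting mass is shared between E (and E ∪ A) on one side and W on the other, so W keeps a non-zero mass after fusion, and combining with the vacuous bba leaves `m1` unchanged.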


IV. DECISION-MAKING SUPPORT 


In this work, we assume Shafer’s model and we use the
classical Pignistic Transformation [7], [10] to take a decision
about the mode of danger. The pignistic probability
(Pign.Proba), also called the betting probability (BetP), is
defined for all $A \in 2^\Theta$ by

$$BetP(A) = \sum_{X \in 2^\Theta} \frac{|X \cap A|}{|X|}\, m(X) \tag{5}$$

where $|X|$ denotes the cardinality of $X$.
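
For singleton hypotheses, Eq. (5) reduces to sharing the mass of each focal element equally among its elements. The sketch below illustrates this; the dictionary representation of a bba and the function name `betp` are assumptions of this example, and the input bba is the one of Rule 2 in section V.

```python
def betp(m, frame):
    """Pignistic probability of every singleton of the frame, Eq. (5)."""
    return {
        theta: sum(w / len(x) for x, w in m.items() if theta in x)
        for theta in frame
    }


E, A, W = frozenset("E"), frozenset("A"), frozenset("W")
m = {A: 0.7, A | E: 0.2, A | W: 0.1}   # observation supporting Alarm
p = betp(m, "EAW")
decision = max(p, key=p.get)           # mode with maximum BetP
```

Here the mass on A ∪ E is split evenly between A and E, and the mass on A ∪ W between A and W, so the decision based on maximum BetP is the Alarm mode.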


V. ALARMS CLASSIFICATION APPROACH 


Our approach to alarm classification treats all the localized
sound sources as subjects of attention and investigation,
since each may indicate a dangerous situation. The specific
attributes of the input sounds emitted by each source are
processed at the sensor’s level and evaluated in a timely manner
for their contribution towards a correct alarm classification
(in terms of degree of danger). The input sound attributes
generated by each sensor at each time moment (scan) are the
frequency of intermittence, $f_{int}$, and the sound signal duration,
$T_{sig}$. A relationship between the specific values of $f_{int}$ and
the associated degree of danger is established, i.e.
the sensor-level input data are mapped into the frame of
discernment for the level of abstraction Degree of
Danger = {Emergency, Alarm, Warning}. The process then
consists in the temporal updating of the sound signal attributes
at the sensor’s level by means of the PCR5 fusion rule. Our
motivation for attribute fusion comes from the need to ascertain the
degree of danger associated with each localized sound source
separately, in order to focus quickly on the most dangerous
alarm information and to take immediate and correct feedback
actions, i.e. to decide properly where the video camera should be
oriented. The applied algorithm consists of the following steps:
• We define the frame of expected hypotheses according to
the respective degree of danger associated with the attributes’
specific values as follows: $\Theta = \{\theta_1 = (E)mergency, \theta_2 =
(A)larm, \theta_3 = (W)arning\}$. The hypothesis with the highest
priority is Emergency, followed by Alarm and then Warning.
These hypotheses are exclusive and exhaustive, hence
Shafer’s model holds and we work on the power set: $2^\Theta =
\{\emptyset, E, A, W, E \cup A, E \cup W, A \cup W, E \cup A \cup W\}$.
• A rule base is defined in order to establish the relationships
between the sound attributes associated with all localized
sources and the corresponding degrees of danger, in the form:

Rule 1: if attributes-type 1 then Emergency

Rule 2: if attributes-type 2 then Alarm

Rule 3: if attributes-type 3 then Warning
where attributes types 1, 2 and 3 could be specific sound
attribute values which are informative enough to be processed
and evaluated for their contribution towards a correct alarm
classification. In this rule base attributes-type 1 is a sound
attribute which is typical for the degree of danger Emergency,
attributes-type 2 is typical for Alarm, and attributes-type 3 for
Warning. In our case the frequency of intermittence $f_{int}$ (if the
signal is intermittent) associated with the localized sound
sources is utilized. The following specific rule base is then
used as an input interface to map the sound attributes (the
so-called observations) obtained from all localized sources into
non-Bayesian basic belief assignments $m_{obs}(.)$:
Rule 1: if $f_{int} \to 1$ Hz then $m_{obs}(E) = 0.9$ and
$m_{obs}(E \cup A) = 0.1$.
Rule 2: if $f_{int} \to 5$ Hz then $m_{obs}(A) = 0.7$, $m_{obs}(A \cup E) =
0.2$ and $m_{obs}(A \cup W) = 0.1$.
Rule 3: if $f_{int} \to 0$ Hz then $m_{obs}(W) = 0.6$ and
$m_{obs}(W \cup A \cup E) = 0.4$.
If the value of the received sound attribute is close to the
particular sound signal parameter for Emergency, our bba
is constructed in such a way that it considers the hypothesis
Emergency and also the composite proposition $E \cup A$, which is
reasonable in this case and represents a possible partial uncertainty.
If the value obtained is close to the particular sound signal
parameter for Alarm, our bba is constructed in such a way that it
considers the hypothesis Alarm itself and also the composite
propositions $A \cup E$ and $A \cup W$, which are reasonable in that case.
A higher mass of belief is assigned to $A \cup E$ than to $A \cup W$ in
order to take care of the possibility of the Emergency case. If the
value obtained is close to the particular sound signal parameter for
Warning, our bba is constructed in such a way that it considers
the hypothesis Warning and also the composite proposition
$E \cup A \cup W$, representing the case of full ignorance, in order
to take care of the possibility of the Alarm and especially of the
Emergency case. All the belief masses not already assigned to
singletons (E, A or W) are assigned to the reasonable partial
uncertainties reflecting the possible noise perturbations in the
observed information.

Fig. 1. Scenario.

• At the very first time moment $k = 0$ we start with an a
priori basic belief assignment (history) set to the vacuous
belief assignment $m_{hist}(E \cup A \cup W) = 1$, since there is no
information about the first detected degree of danger of the
sound sources.

• The bba $m_{obs}(.)$ of the currently received measurement
(for each of the located sound sources), obtained from the
input interface mapping, is combined with the history’s bba in
order to obtain the estimated bba of the current degree of danger,
$m(.) = [m_{hist} \oplus m_{obs}](.)$. PCR5 and DS are tested in the
process of temporal data fusion to update the bba’s associated
with each sound emitter.

• A flag for an especially high degree of danger has to be
raised when, during the a priori defined scanning period,
the maximum pignistic probability [7] is associated with the
hypothesis Emergency.

For security purposes, it is very important to keep updating
sequentially the estimate one has of the true
modes of the sound emitters, even if they are in the lowest
priority mode (i.e. in Warning mode only), in order not to miss
unexpected changes of the alarm mode.
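
The steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: "close to" is interpreted as the nearest nominal frequency of the rule base, bba's are dictionaries keyed by frozensets, and the names `observe`, `pcr5`, `betp` and `track` are hypothetical.

```python
from itertools import product

E, A, W = frozenset("E"), frozenset("A"), frozenset("W")
THETA = E | A | W

# Rule base of this section: nominal f_int (Hz) -> observation bba m_obs.
RULES = {
    1.0: {E: 0.9, E | A: 0.1},              # Rule 1: Emergency
    5.0: {A: 0.7, A | E: 0.2, A | W: 0.1},  # Rule 2: Alarm
    0.0: {W: 0.6, THETA: 0.4},              # Rule 3: Warning
}


def observe(f_int):
    """Assumption: pick the rule whose nominal frequency is nearest to f_int."""
    return RULES[min(RULES, key=lambda f: abs(f - f_int))]


def pcr5(m1, m2):
    """Two-source PCR5 fusion under Shafer's model."""
    fused = {}
    for (x1, w1), (x2, w2) in product(m1.items(), m2.items()):
        x = x1 & x2
        if x:
            fused[x] = fused.get(x, 0.0) + w1 * w2
        elif w1 + w2 > 0.0:
            conflict = w1 * w2
            fused[x1] = fused.get(x1, 0.0) + w1 * conflict / (w1 + w2)
            fused[x2] = fused.get(x2, 0.0) + w2 * conflict / (w1 + w2)
    return fused


def betp(m):
    """Pignistic probability of each singleton E, A, W."""
    return {t: sum(w / len(x) for x, w in m.items() if t in x) for t in THETA}


def track(f_int_sequence):
    """Scan-by-scan update m = [m_hist (+) m_obs], starting from a vacuous prior."""
    m_hist = {THETA: 1.0}
    probabilities = []
    for f_int in f_int_sequence:
        m_hist = pcr5(m_hist, observe(f_int))
        probabilities.append(betp(m_hist))
    return probabilities


# Three Emergency-like scans, three Warning-like scans, one Emergency-like scan.
probs = track([1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0])
flag = max(probs[-1], key=probs[-1].get) == "E"   # high-danger flag
```

Replacing `pcr5` with an implementation of Dempster's rule in `track` gives the DS variant tested in the comparison.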


VI. SIMULATION SCENARIO AND RESULTS 


In our simulation scenario (Fig. 1) a set of three sensors
located at different distances from the microphone array is
installed in an observed area for protection purposes, together
with a video camera [8]. It is assumed that the sensors are
assembled with alarm devices as follows: Sensor 1 with
Sonitron, Sensor 2 with E2S, and Sensor 3 with System Sensor
alarm devices. In case of alarm events (smoke,
flame, intrusion, etc.) the alarm devices emit powerful sound
signals with various durations and frequencies of intermittence,
depending on the nature of the event.
These sensors are used to estimate the
level of danger/threat at each place where they are located.
The data obtained from each source are processed and analyzed
independently at the particular sensor’s level, at consecutive time
moments, with regard to all possible degrees of danger:
$\theta_1 = (E)mergency$, $\theta_2 = (A)larm$, and $\theta_3 = (W)arning$.
In doing so one can find the first suspicious moment when
the situation could eventually become dangerous.

Fig. 2. Sonitron, E2S, System Sensor Sound Characteristics.

Table 1. Sound signal parameters.

              Warning    Alarm    Emergency
$f_{int}$     0 Hz       5 Hz     1 Hz
$T_{sig}$     10 s       30 s     60 s

The sound signals representing Warning, Alarm and Emergency,
emitted by the alarm devices produced by the Sonitron, E2S
and System Sensor companies and used in our simulation (Table
1), are shown in Fig. 2. The first (left) column of Fig. 2 relates
to Sonitron, the second column to E2S, and the third (right)
column to System Sensor devices. The first row of
this figure represents signal 1, for Warning, the second row
represents signal 2, for Alarm, and the third row represents
signal 3, for the Emergency case. The Alarm signal is intermittent
with a frequency of intermittence $f_{int} = 5$ Hz and a duration
$T_{sig} = 30$ s, so-called type I. The Emergency sound signal is
intermittent with a frequency of intermittence $f_{int} = 1$ Hz and
a duration $T_{sig} = 60$ s, so-called type II. The Warning signal is
continuous with $f_{int} = 0$ Hz and $T_{sig} = 10$ s.

Our simulation scenario considers a true degree of danger
associated with the sound sources as follows: Emergency mode
for the first sound emitter, Alarm mode for the second, and
Warning mode for the third one. The three sources are
processed in parallel and, because of possible sound perturbations,
we assume that random changes can be observed
over the scans for a given mode. We therefore introduce
some switches between the three modes Emergency, Alarm
and Warning to simulate what can happen in practice (what
we call the ground truth, displayed with black plots in
figures 3 and 4). Accordingly, three main cases are
examined:


• The most interesting case for us is the estimation of the danger
level by sensor 1, associated with Emergency mode. In
our simulation, the Ground Truth associated with
Sensor 1 considers that during scans 1-3 the observations
generated support the Emergency mode (the highest level
of danger). From scan 4 to scan 6 the observations
generated support the Warning mode (the lowest level
of danger). From scan 7 to scan 30 the observations
generated support the Emergency mode again. Such a
scenario is important in real-world cases because
source data can be deteriorated by noise perturbations,
and therefore conflicts can arise between
observations from scan to scan. We assume that a conflict
occurs in the sound data between the Emergency and Warning
modes, because it could strongly weaken the decision
taken. It could become a reason to ignore the significance
of an out-of-the-ordinary, dangerous situation.


• The second interesting case concerns the estimation of the
probabilities of the modes associated with sound emitter
2, working in Alarm mode. The Ground Truth has been
slightly changed with respect to the ground truth
simulated for sensor 1. We assume that during scans
1-3 the observations generated correctly support the
Alarm mode. From scan 4 to scan 8 the observations
generated support the Emergency mode because of noise
perturbations. From scan 9 to scan 30 the observations
generated again correctly support the Alarm mode.


• The third interesting case concerns the estimation of the
probabilities of the modes associated with the third emitter,
working in Warning mode. In our simulation of this
case, we consider that during scans 1-2 the observations
generated correctly support the Warning mode. From
scan 3 to scan 5 the observations generated support
the Emergency mode because of some possible noise
perturbations. From scan 6 to scan 30 the observations
generated again correctly support the Warning mode.
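
The three ground-truth sequences described above can be written down compactly. The sketch below is a hypothetical encoding for illustration: the helper `ground_truth` and the one-letter mode labels are assumptions of this example, while the scan intervals are those stated in the text.

```python
def ground_truth(true_mode, noisy_mode, first, last, n_scans=30):
    """Mode supported by the observations at scans 1..n_scans.

    The observations support `noisy_mode` during scans first..last
    (inclusive) and `true_mode` at all other scans.
    """
    return [
        noisy_mode if first <= k <= last else true_mode
        for k in range(1, n_scans + 1)
    ]


SCENARIO = {
    1: ground_truth("E", "W", 4, 6),   # emitter 1: Emergency, Warning at scans 4-6
    2: ground_truth("A", "E", 4, 8),   # emitter 2: Alarm, Emergency at scans 4-8
    3: ground_truth("W", "E", 3, 5),   # emitter 3: Warning, Emergency at scans 3-5
}
```

Each list can then be mapped scan by scan to the observation bba of the corresponding rule and fed to the fusion rule under test.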

As a result of processing and analyzing the sound data
obtained from the three sources, processed in parallel, one
establishes at each scan and for each source the pignistic
probabilities associated with all the considered modes of danger.
The decisions are taken periodically at the video camera level,
depending on: 1) the specificities of the video camera
(the time needed to steer the video camera toward a localized
direction); 2) the time needed to analyze correctly and
reliably the sequentially gathered information. We choose as
a reasonable sampling period for camera decisions $T_{dec} =
20$ s, i.e. at every 10th scan we establish the decision
about the most probable mode of danger associated with each
sound source, and thereby declare the direction for steering the
video camera. For our scenario, the decisive scans are the
10th, 20th, and 30th. In the next two subsections we analyze
the performances of PCR5 and DS to conclude on their ability
(or inability) to correctly identify the alarm modes for
prioritization purposes.


A. PCR5 rule performance for danger level estimation.


Figure 3 shows the values of the pignistic probabilities of
each mode (Emergency, Alarm, Warning) associated with the three
sound emitters (1st source in Emergency mode, top subplot;
2nd source in Alarm mode, middle subplot; 3rd source in
Warning mode, bottom subplot) during all 30 scans. Each source
has been perturbed with noise in
accordance with the simulated Ground Truth associated with the
particular sound source. These probabilities are obtained for
each source independently as a result of sequential data fusion
of the $m_{obs}(.)$ sequence using the PCR5 combination rule. For each
source, we analyze the probabilities of its modes obtained
with BetP computed from the PCR5 rule and the corresponding
decisions for steering the camera at scans no. 10, 20, and 30.

Fig. 3. PCR5 rule Performance for danger level estimation.

Decision taken by PCR5 rule at scan 10:


For source 1, associated with Emergency mode (Fig. 3, top
subplot), the Pign.Proba established by PCR5 at scan 10 are as
follows: BetP(E) = 1.0, BetP(A) = 0, and BetP(W) = 0.
During the first scans one has BetP(E) < 1 because of the
impact of the full uncertainty at the beginning. During the
transition period between scans 4 and 6 the pignistic probability
BetP(E) decreases to nearly 0.4, and in the meantime BetP(W)
increases to nearly 0.6, thus reflecting the new observations
supporting the Warning mode. After the proper sound signal is
reestablished at scan 7, the PCR5 combination rule leads to a
quick re-estimation of the belief masses assigned to the right
Emergency mode. One sees clearly the efficiency of PCR5 in
detecting a mode switch from the sequential fusion of $m_{obs}(.)$. At
this processing stage, after the decisive 10th scan, the PCR5 rule
takes a correct, reliable decision with BetP(E) = 1.0, assuring that
the camera will steer in this direction with the highest priority.

For source 2, associated with Alarm mode (Fig. 3, middle
subplot), the Pign.Proba established by PCR5 are as follows:
BetP(E) = 0.5, BetP(A) = 0.5, and BetP(W) = 0. At the
first scans, BetP(A) < 1 because of the full uncertainty at
the very first time moment, and then BetP(A) → 1. During
the transition time between scans 4 and 8, BetP(A) gradually
decreases, while BetP(E) gradually increases. During this
period the PCR5 rule pays attention to the mode
with the highest priority, i.e. the Emergency mode. Starting
from scan 9, the PCR5 rule gradually (and quickly enough,
after a short delay) reestablishes the probability mass assigned
to the Alarm mode. At the end of scan 10 the PCR5 rule keeps
BetP(A) ≈ BetP(E), staying cautious about Emergency, but
the rule is on the way to fully reestablishing the belief in the
proper Alarm mode for this case and to forgetting the mistaken
Emergency mode.
For source 3, associated with Warning mode (Fig. 3, bottom
subplot), the Pign.Proba established by PCR5 are as follows:
BetP(E) = 0.2, BetP(A) = 0, and BetP(W) = 0.8. Until
scan 10, because of the conflicts in the sound attribute
measurements, the PCR5 rule gives some support (non-null
probability) to the Emergency mode as well as to the Warning mode.
Until scan 10, its behavior is cautious about the Emergency mode,
and during this time period it does not establish a hard decision.
The PCR5 result makes sense: the decision in favor of the Warning
mode is not yet firm.
Decision taken by PCR5 rule at scan 20 and scan 30:
From scan 15 on, and for all sound sources 1, 2 and 3, the PCR5
rule estimation is fully adequate and reasonable.
For source 1, associated with Emergency mode, one has:
BetP(E) = 1, BetP(A) = 0, and BetP(W) = 0.
For source 2, associated with Alarm mode: BetP(E) = 0,
BetP(A) = 1, and BetP(W) = 0.
For source 3, associated with Warning mode: BetP(E) = 0,
BetP(A) = 0, and BetP(W) = 1.

These Pign.Proba remain the same at scans
20 and 30, associating in a stable way the highest priority danger
with sound source 1, as expected in such a scenario.


B. Dempster’s rule performance for danger level estimation. 


Figure 4 shows the corresponding values of the pignistic
probabilities of each mode (Emergency, Alarm, Warning)
associated with the three sound emitters (1st source in Emergency
mode, top subplot; 2nd in Alarm mode, middle subplot; and
3rd in Warning mode, bottom subplot) during all 30 scans,
obtained as a result of sequential data fusion of the
$m_{obs}(.)$ sequence using the DS rule of combination.

Decision taken by Dempster’s rule at scan 10: 

For source 1, associated with Emergency mode (Fig. 4,
top subplot), the Pign.Proba established by DS are as
follows: BetP(E) = 1, BetP(A) = 0, and BetP(W) = 0.
It is obvious that between scans 1 and 10 DS is unable
to respond to the new observations arriving from scan 4 on and
supporting the Warning mode. DS does not reflect at all the
newly available data, which are informative and should be
taken into account. This pathological behavior could lead to
wrong decisions. In our particular case, however, DS leads
to the right final decision at scan 10 by coincidence, but this
decision cannot be accepted as coherent and reliable,
because it is not built on consistent logical grounds. Taking
important decisions by chance could be critically wrong and
could cause significant damage.
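
A short worked computation with Eqs. (1)-(3) suggests why DS cannot respond in this case. It is a sketch under the assumption that the observations at scans 1-3 map exactly to the Rule 1 bba of section V and the observation at scan 4 to the Rule 3 bba; the history values follow from fusing three Rule 1 bba's with the vacuous prior.

```latex
\begin{align*}
% History after scans 1-3 (three Rule 1 observations fused with the vacuous prior)
m_{hist}(E) &= 0.999, & m_{hist}(E \cup A) &= 0.001 \\
% Observation at scan 4 (Rule 3)
m_{obs}(W) &= 0.6, & m_{obs}(E \cup A \cup W) &= 0.4 \\
% Total conflict, Eq. (3): both history focal elements are disjoint from W
K_{12} &= 0.999 \cdot 0.6 + 0.001 \cdot 0.6 = 0.6 \\
% Conjunctive consensus, Eq. (2)
m_{12}(E) &= 0.999 \cdot 0.4 = 0.3996, & m_{12}(E \cup A) &= 0.001 \cdot 0.4 = 0.0004, \quad m_{12}(W) = 0 \\
% Dempster's rule, Eq. (1)
m_{DS}(E) &= \frac{0.3996}{1 - 0.6} = 0.999, & m_{DS}(E \cup A) &= \frac{0.0004}{1 - 0.6} = 0.001, \quad m_{DS}(W) = 0
\end{align*}
```

Under these assumptions the fused bba after scan 4 equals the history: once no focal element of the history contains W, every product involving $m_{obs}(W)$ falls into the conflict and is removed by the normalization, so the Warning observations leave no trace.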

For source 2, associated with Alarm mode (Fig. 4, middle
subplot), the Pign.Proba established by DS are as follows:
BetP(E) = 1, BetP(A) = 0, BetP(W) = 0. Between
scans 1 and 10, because of the conflicts in the obtained
measurements, DS generates a totally wrong Pign.Proba
BetP(E) = 1.0 assigned to Emergency, producing a hard
decision for the Emergency case. DS leads here to a false alarm.
As a result the video camera will be steered in a wrong direction,
which in reality is not the direction with the highest priority.
This means that the truly most dangerous direction
will be ignored.

For source 3, associated with Warning mode (Fig. 4, bottom
subplot), the Pign.Proba established by DS are as follows:
BetP(E) = 1, BetP(A) = 0 and BetP(W) = 0. Here
the same false alarm situation is established as for source 2.
In fact, at scan 10 DS establishes totally wrong decisions
for source 2 and source 3. The only right decision, taken
for source 1, is obtained by coincidence (because of the
non-responding behaviour of the rule) and has no logical ground.


Decision taken by Dempster’s rule at scan 20: 


At scan 20, for source 1, DS keeps its non-responding
behaviour, leading to a right decision that is taken by coincidence.
For source 3, DS keeps the false alarm, as at scan 10.
It succeeds in taking a right decision for source 2, associated
with Alarm mode, after a longer delay in reestablishing the
belief masses for Alarm in comparison with the PCR5 rule.


Decision taken by Dempster’s rule at scan 30: 


At this scan DS succeeds in keeping the right decision for source
2. However, it keeps performing as at scan 20, producing a right
but logically ungrounded decision for source 1, and a false alarm
for source 3. Taking important security decisions
by chance could be critically wrong and dangerous. Steering
the camera toward a wrong direction on the basis of a false alarm
could become critical too, because the proper camera
response will then be missed.

Fig. 4. Dempster’s rule Performance for danger level estimation.


VII. CONCLUSIONS 


In this paper alarm identification and prioritization (in
terms of degree of danger) has been considered and realized
using the PCR5 rule of combination in order to estimate the proper
degree of danger, especially in a crowded scene, where events
could happen at a set of a priori defined dangerous directions.
The method is based on the sequential fusion of
the sound source information, converted into basic belief
assignments, obtained by a two-dimensional
microphone array defining the positions of the sources in the
surveillance area. The
performance of the PCR5 rule has been compared with
that of Dempster’s rule. The
results obtained show the strong ability of the PCR5 rule to
track in a coherent and stable way the evolution of all
possible degrees of danger related to all the localized sources.
This is especially significant in case of discrepancies and
conflicts in the sound source data when the highest priority mode,
Emergency, occurs. The PCR5 rule avoids producing a mistaken
decision, and thereby prevents the most dangerous case from
being left without immediate attention. A similarly adequate
behavior is established in cases of lower danger priority.
Dempster’s rule shows weakness in resolving the cases
examined. In the Emergency case, Dempster’s rule does not respond
to the level of conflict between the sound sources, leading
to ungrounded decisions. In cases of lower danger
priority (perturbed Warning and Alarm modes), Dempster’s
rule could cause a false alarm and can deflect attention
from the real dangerous source by assigning a wrong
steering direction to the surveillance camera. In real-world
cases involving a broad surveillance area and multiple located
sound sources, it becomes very important to realize distributed
parallel processing with respect to the number of sources, in
order to reach a correct decision in due time.

Abstract—The present research aims to: (i) characterize
the ability of the human visual system to determine an object’s
slant on the basis of a combination of visual stimulus
characteristics that are in general uncertain and even
conflicting; (ii) evaluate
the influence of human age on visual cue assessment
and processing; (iii) estimate the process of human visual cue
integration based on the well-known Normalized Conjunctive
Consensus and Averaging fusion rules, as well as on the more
efficient probabilistic Proportional Conflict Redistribution rule
no. 5 defined within Dezert-Smarandache Theory for plausible
and paradoxical reasoning. The research focuses
on the ability of these fusion rules to predict adequately
the behavior of individuals, as well as of age-contingent groups of
individuals, in the visual cue integration process.

Keywords—Integration of visual stimulus characteristics;
DSmT; probabilistic Proportional Conflict Redistribution rule no. 5;
Normalized Conjunctive rule; Averaging rule.


I. INTRODUCTION 


The visual information about the 3D world utilized by
humans is provided by a set of 2D images on the eye retina.
This leads to uncertainty and/or discrepancy in image
interpretation because the same projections could belong to different
3D objects. As an additional complexity, the visual system
has to recover the information about the objects’ depth (i.e. the
mutual disposition of objects) with respect to the observer. To
overcome these difficulties one needs to utilize and combine
effectively a variety of visual characteristics (so-called
cues) in order to achieve inferences that are more informative
and potentially more accurate than those obtained by
means of a single cue. The process of combining, manipulating
and interpreting information in the stimulus integration problem
is beneficial because it allows the human visual system to
estimate and perceive the objects’ properties more accurately
and to take appropriate actions, leading to improved reasoning
(judgment) under uncertainty and/or possible conflicts between
different visual stimuli. The uncertainty associated with
the utilized visual cues and the possible conflicts between
them influence decision making and action control in the
process of human aging, due to the increased level of internal
noise in the visual system [1]. If the visual system neglects
some of the available information [2], the visual signal-to-noise
ratio deteriorates further. Throughout the life cycle
many aspects of vision and visual information processing
decline and affect everyday task performance. The 3D shape of
objects and their spatial layout are specified on the basis of
both static and dynamic visual cues. Age-related impairments
in visual processing and perception are observed for both of
them [3]. Therefore the task of vision inherently requires the
integration of all available visual cue information to determine
a 3D object’s shape. This paper focuses on the human way of
integration of motion and texture information in the process
of estimating an object’s slant. Our goal is to reveal not only
the age-related changes in the process of visual information
assessment, but also the plasticity of the visual system, i.e. its
ability to adapt to these changes and to exploit efficiently all the
available information in the visual scene in order to provide
a meaningful output: more
accurate and robust spatial information about the 3D objects.
We present and compare the performance of three fusion
rules for modeling the human way of visual cue integration:
Normalized Conjunctive Consensus (NCC), Averaging (AVE), and
the probabilistic Proportional Conflict Redistribution rule no. 5
(pPCR5) defined recently within Dezert-Smarandache Theory
(DSmT) for plausible and paradoxical reasoning. In section II
we briefly present the visual cue integration problem and recall
the principles of the NCC, AVE and pPCR5 fusion rules. In
section III, we present the experimental strategy and procedure,
the methods, the subjects involved in the experiments, the stimuli
and the apparatus used. In section IV, the research reasoning logic
is presented, as well as the results obtained with the applied
fusion rules. Concluding remarks are given in section V.


II. VISUAL CUE INTEGRATION FOR SLANT ESTIMATION 


Vision provides a number of static and dynamic cues to 
the 3D layout of observed objects and scenes. Humans show
individual differences in their abilities to utilize these cues for
judgments. The first source of visual information considered 
in this paper relates to 2D texture variations in the projection 
of a slanted plane. The texture elements alter their form and 
the degree of shortening depends on their relative position 
in the plane and relative orientation to the observer - the 
shortening of element form and the texture density are highest 
in the direction of plane’s tilt. The degree of texture variation 
depends on surface slant and it is biggest in the most distant 
plane areas with respect to the observer. Another source of 
visual information considers the object’s motion relative to the 
observer. The gradient of velocity in two orthogonal directions 
contains information about the object’s slant and tilt. When 
both static and motion information is available, the efficient 
way of combining data, provided by them, leads to more 
accurate and robust estimation of object’s geometry and to 
better understanding and recognition of the surrounding scenes 
and objects. The common ideas for visual cue combination 
in order to specify the viewer-dependent object’s character- 
istics rely on the assumption of cue independence. There 
are various methods for modeling the visual cue integration 
process. Bayesian inference [4], [5] is a classical approach for 
modeling and processing probabilistic information. An ideal 
Bayesian observer was used to define the optimal weighting 
and combination of redundant visual cues [8], [9]. The main
difficulties in applying it concern the need for measurement
statistics and for a priori knowledge. The
Bayesian framework was applied for modeling the spatial
integration of auditory and visual information [6] and for visual
and haptic integration [7], where the main idea is that the
human brain combines visual cues to obtain the most reliable
estimate of the state of the world, i.e. the estimate in which the
variance of the resulting combined cue is minimized. As will
be shown in our research, this kind of integration, being very
sensitive to the sources with the bigger means, neglects part of
the available information, which is very unsatisfactory behavior
when combining conflicting visual cues. Generally visual
data are not only inaccurate, incomplete and uncertain, but 
even conflicting, because the observer moves, or the surfaces 
could change their orientation in the particular scene, or one 
object occludes the other. All these data particularities must 
be incorporated in the process of human visual perception in 
order to provide a complete and accurate model of the real 
world and to improve the decision accuracy. In our study we
will apply and compare the performance of three fusion rules,
the NCC, pPCR5, and AVE rules, in modeling the human
process of visual cue integration.


A. Normalized Conjunctive Consensus rule 


The Normalized Conjunctive Consensus (NCC) rule is used
to combine simultaneously visual cues that are assumed independent.
In the case considered, the information obtained from the available
visual cues is characterized by Gaussian likelihood functions
with given means $\mu_i$, $i = 1, 2, \ldots$ and standard deviations
$\sigma_i$, $i = 1, 2, \ldots$, defining the uncertainty encountered in the data.
In the case of two independent visual cues with one-dimensional
Gaussian distributions $p_1(x)$ and $p_2(x)$, the fused distribution is the normalized product

$$p_{NCC}(x) = \frac{p_1(x)\,p_2(x)}{\int p_1(x)\,p_2(x)\,dx} \qquad (1)$$

It is characterized by a mean biased toward the function
with the bigger of the two means, similar to the Bayesian estimator.
It is optimal (it minimizes the variance of the estimation error)
when the original distributions have close mean values.
When two visual cues are in conflict, however (characterized
by distant distributions), the NCC rule neglects (does not
utilize) part of the available information, because the source
with the bigger mean is weighted more heavily. In this case
it is reasonable to keep the original distributions in the fused
probability density function until it is possible to make a reliable
decision. This is done by the pPCR5 fusion rule defined in
DSmT.
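As an illustration only, the following sketch fuses two Gaussian cue densities numerically, assuming that the NCC rule for densities is the normalized pointwise product of the two cue densities. The grid, the cue means (85 and 95 deg) and the standard deviations are hypothetical values chosen for the example, and the function names are not taken from the paper.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Value of a Gaussian density with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def ncc_fusion(p1, p2, dx):
    """Normalized pointwise product of two densities sampled on the same grid."""
    product = [a * b for a, b in zip(p1, p2)]
    norm = sum(product) * dx  # rectangle-rule estimate of the normalizing integral
    return [v / norm for v in product]

def mean_and_variance(xs, p, dx):
    """Mean and variance of a density sampled on the grid xs with step dx."""
    mean = sum(x * v for x, v in zip(xs, p)) * dx
    var = sum((x - mean) ** 2 * v for x, v in zip(xs, p)) * dx
    return mean, var

# Hypothetical cues: estimates of 85 and 95 deg, both with sigma = 10 deg.
dx = 0.1
xs = [i * dx for i in range(1801)]  # grid over 0..180 deg
p1 = [gaussian_pdf(x, 85.0, 10.0) for x in xs]
p2 = [gaussian_pdf(x, 95.0, 10.0) for x in xs]
fused = ncc_fusion(p1, p2, dx)
fused_mean, fused_var = mean_and_variance(xs, fused, dx)
print(fused_mean, fused_var)
```

With equal standard deviations the fused mean falls midway between the two cue means and the fused variance is half the variance of each cue, which is the variance-reducing behavior described above.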


B. Probabilistic Proportional Conflict Redistribution rule no.5 


The general principle of all Proportional Conflict Redistribution
rules [10], Vol. 3 is to: 1) calculate the conjunctive
consensus between sources of evidence (different visual
cues); 2) calculate the total or partial conflicting masses; 3)
redistribute the conflicting mass (total or partial) proportionally
on non-empty sets involved in the model according to
all integrity constraints. The recently proposed non-Bayesian
probabilistic Proportional Conflict Redistribution rule no. 5
[11] is based on the discrete Proportional Conflict Redistribution
rule no. 5 (PCR5) [10], Vol. 3, for combining discrete
basic belief assignments. For completeness, we will briefly discuss
the main idea behind the discrete PCR5. It comes
from the necessity to deal with both uncertain and conflicting
information, transferring partial or total conflicting masses
only to the non-empty sets involved in the particular
conflict and proportionally to their individual masses. A basic
belief assignment (bba) represents the knowledge provided
by a particular source of information about its belief in the true
state of the problem under consideration. Given a frame of
hypotheses $\Theta = \{\theta_1, \ldots, \theta_n\}$ and the so-called power set
$2^\Theta = \{\emptyset, \theta_1, \ldots, \theta_n, \theta_1 \cup \theta_2, \ldots, \theta_1 \cup \theta_2 \cup \ldots \cup \theta_n\}$, on which the
combination is defined, the general basic belief assignment is
defined as a mapping $m_s(\cdot) : 2^\Theta \to [0, 1]$, associated with
the given source of information $s$, such that $m_s(\emptyset) = 0$
and $\sum_{X \in 2^\Theta} m_s(X) = 1$. The quantity $m_s(X)$ represents the
mass of belief exactly committed to $X$. Under Shafer’s model
assumption of the frame $\Theta$ (requiring all the hypotheses to
be exclusive and exhaustive), the PCR5 combination rule for
only two sources of information is defined as: $m_{PCR5}(\emptyset) = 0$
and $\forall X \in 2^\Theta \setminus \{\emptyset\}$

$$m_{PCR5}(X) = m_{12}(X) + \sum_{\substack{Y \in 2^\Theta \setminus \{X\} \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2\, m_1(Y)}{m_2(X) + m_1(Y)} \right]$$

All sets involved in the formula are in canonical form. The
quantity $m_{12}(X)$ corresponds to the conjunctive consensus,
i.e. $m_{12}(X) = \sum_{X_1, X_2 \in 2^\Theta,\ X_1 \cap X_2 = X} m_1(X_1)\, m_2(X_2)$. All denominators
are assumed to be different from zero; if a denominator is zero,
that fraction is discarded. No matter how big or small the
conflicting mass is, PCR5 mathematically does a proper redis- 
tribution of the conflicting mass since PCR5 goes backwards 
on the tracks of the conjunctive rule and redistributes the 
partial conflicting masses only to the sets involved in the 
conflict and proportionally to their masses put in the conflict, 
considering the conjunctive normal form of the partial conflict. 
PCR5 is quasi-associative and preserves the neutral impact
of the vacuous belief assignment. The probabilistic PCR5
is an extension of the discrete PCR5 to its continuous
probabilistic counterpart. The basic belief assignment involved in
the discrete PCR5 rule is extended to probability densities of
random variables. For two independent sources of information
with given Gaussian distributions $p_1(x)$ and $p_2(x)$, the
obtained combined result becomes [11]:

$$p_{PCR5}(x) = p_1(x) \int \frac{p_1(x)\,p_2(y)}{p_1(x) + p_2(y)}\,dy + p_2(x) \int \frac{p_2(x)\,p_1(y)}{p_2(x) + p_1(y)}\,dy$$
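The discrete PCR5 rule recalled above can be sketched for two sources as follows. This is a minimal illustration under Shafer’s model, with focal elements represented as Python frozensets; the frame {A, B} and the mass values are hypothetical and serve only as a worked example.

```python
from itertools import product

def pcr5(m1, m2):
    """Combine two bbas (dict: frozenset -> mass) with the two-source PCR5 rule."""
    fused = {}
    for (x1, a), (x2, b) in product(m1.items(), m2.items()):
        common = x1 & x2
        if common:
            # Non-conflicting pair: the product goes to the intersection
            # (conjunctive consensus).
            fused[common] = fused.get(common, 0.0) + a * b
        elif a + b > 0.0:
            # Partial conflict a*b: sent back to the two sets involved,
            # proportionally to the masses a and b they put in the conflict.
            fused[x1] = fused.get(x1, 0.0) + a * a * b / (a + b)
            fused[x2] = fused.get(x2, 0.0) + b * b * a / (a + b)
    return fused

# Hypothetical frame with two exclusive hypotheses A and B.
A, B = frozenset({"A"}), frozenset({"B"})
m1 = {A: 0.6, B: 0.4}
m2 = {A: 0.2, B: 0.8}
fused = pcr5(m1, m2)
print(fused[A], fused[B])

# The vacuous bba (all mass on the whole frame) leaves the other source unchanged.
vacuous = {A | B: 1.0}
unchanged = pcr5(m1, vacuous)
```

In this example the conjunctive consensus gives 0.12 to A and 0.32 to B, and the two partial conflicts (0.48 and 0.08) are split between A and B, so the fused masses again sum to one.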
The behavior of the pPCR5 fusion rule in comparison to the NCC
rule (1) can be characterized by the two cases below:
Case 1: both densities $p_1(x)$ and $p_2(x)$ are close (Fig. 1,
case 1). The combined density acts as an amplifier of the
information by reducing the variance. Here pPCR5 acts as
the NCC fusion rule.
Case 2: the densities $p_1(x)$ and $p_2(x)$ are distant (Fig. 1, case
2). Then the combined density keeps both original densities
(not merging them into a single unimodal Gaussian
density as the NCC rule does), which avoids neglecting a part of the
available information.

This new (from a theoretical point of view) property is very
interesting and presents advantages for practical applications,
as will be shown in our particular research. Application of the
pPCR5 fusion rule assures robustness to potential errors
and allows taking more reliable and adequate decisions in the
process of integrating different cues in visual perception.

Fig. 1. Performance of pPCR5 fusion rule vs. NCC rule.

Fig. 2. Texture types: (left) dots, (right) lines.


C. Averaging rule 


The discrete simple Averaging rule consists of a simple
arithmetic average of the belief functions associated with the sources
of information (in our case particular visual cues). For
two given sources of information defined by their discrete bba’s
$m_1(\cdot)$ and $m_2(\cdot)$, for $\forall X \in 2^\Theta \setminus \{\emptyset\}$, the combined distribution
based on this rule becomes $m_{AVE}(X) = \frac{1}{2}(m_1(X) + m_2(X))$.
This trade-off rule is commutative, but not associative.
In the case of two independent and equally reliable/trustful
visual characteristics, associated with Gaussian distributions
$p_1(x)$ and $p_2(x)$, the combined distribution based on the Averaging
rule becomes:

$$p_{AVE}(x) = \frac{1}{2}\left(p_1(x) + p_2(x)\right)$$
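Both forms of the Averaging rule are simple enough to sketch directly. The example below is illustrative only: the bba values, the cue means and the grid are hypothetical, and the function names are not taken from the paper.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Value of a Gaussian density with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def average_bba(m1, m2):
    """Discrete Averaging rule: arithmetic mean of two bbas (dict: set -> mass)."""
    focal_sets = set(m1) | set(m2)
    return {x: 0.5 * (m1.get(x, 0.0) + m2.get(x, 0.0)) for x in focal_sets}

def average_density(p1, p2):
    """Continuous Averaging rule on a grid: equal-weight mixture of two densities."""
    return [0.5 * (a + b) for a, b in zip(p1, p2)]

# Discrete example on a hypothetical frame {A, B}.
A, B = frozenset({"A"}), frozenset({"B"})
m_ave = average_bba({A: 0.6, B: 0.4}, {A: 0.2, B: 0.8})

# Continuous example with hypothetical cues at 60 and 120 deg (sigma = 5 deg).
dx = 0.5
xs = [i * dx for i in range(361)]  # grid over 0..180 deg
p_ave = average_density([gaussian_pdf(x, 60.0, 5.0) for x in xs],
                        [gaussian_pdf(x, 120.0, 5.0) for x in xs])
mass = sum(p_ave) * dx
mean = sum(x * v for x, v in zip(xs, p_ave)) * dx
print(m_ave[A], m_ave[B], mass, mean)
```

The averaged density is an equal-weight mixture, so for distant cues it keeps both modes, and its mean is the midpoint of the two cue means.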


III. EXPERIMENTAL GOAL, METHODS, AND PROCEDURE 


The experimental goal is to: (i) characterize the
ability of the human visual system to determine an object’s slant on
the basis of a single available cue, Texture Information Only
(referred to as the TIO case) or Motion Information Only (referred
to as the MIO case), as well as in the case of both Texture and
Motion information (referred to as the TM case), since humans show
significant individual differences in their abilities to combine
and utilize both texture and motion information for judgments;
(ii) evaluate the influence of human age on the assessment of
objects’ characteristics using the available visual information.


A. Observers 


Twelve elderly (mean age 74 years, range 67-85 years) 
and twelve younger (mean age 21 years, range 18-25 years) 
subjects took part in the experiments. All of them passed
an eye examination. None of them reported having any major
health problems. 


B. Stimuli 


The stimuli represent two slanted textured planes that form 
a symmetric horizontal dihedral angle. Two types of textures 
were rendered over the planes: dots (Fig.2-left) or a texture of 
non-intersecting lines (Fig.2-right). 

Nine different sizes of the dihedral angle were used: 20
deg, 35 deg, 50 deg, 65 deg, 80 deg, 95 deg, 110 deg, 125
deg, and 140 deg. To change the size of the dihedral angle, the
slant of the two planes hinged together was changed by
an equal amount. One static and two dynamic conditions were
generated. In all conditions the dihedral angle was presented in
the middle of a computer screen under perspective projection
and its vertical dimension was fixed to 7 degrees of visual
angle. In the static condition (TIO case) the information
about the surface slant and consequently about the size of the
dihedral angle is provided only by the changes in the texture
over the planes. In the dynamic conditions the
dihedral angle translated horizontally leftwards or rightwards
with a speed of 6.4 deg of arc/s. It changed direction every
1.1 s. In one of these conditions (MIO case) the texture specifies a flat
object and thus the information about the surface slant and
the size of the dihedral angle is provided only by the motion.
To achieve this, the texture coordinates were calculated relative
to the eye coordinate system and they did not vary with the
relative depth of the planes forming the dihedral angle. In the
other dynamic condition (TM case) both the texture variation
and the velocity of the object parts depend on the relative
depth and therefore both specify the surface slant and the size
of the dihedral angle.


C. Apparatus 


The stimuli were presented on a 21” Dell Trinitron monitor
with an Nvidia Quadro 900XGL graphics card. The monitor
resolution was 1600 x 1200 pixels and the refresh rate was 85 
Hz. The stimuli were rendered on the screen using OpenGL. 
Grayscale images with 8 bit precision (256 colors) were used. 
The monitor was gamma-corrected using a lookup table. 


D. Experimental Procedure 


The observer sat in a semi-illuminated room at a distance
of 114 cm from the computer screen. The method of single
stimuli was used. On every trial the observers had to compare
the stimulus with an internal standard, a right dihedral
angle. The task of the observers was to judge whether
the presented dihedral angle was larger or smaller than a right
angle. Each subject participated in 6 sessions. The sessions
differed by the experimental condition and the texture type.
The order of the experimental sessions was counterbalanced
across observers. In every experimental session each of the 9 different
values of the dihedral angle was presented 30 times in random
order. Each experimental session started with a demo to
familiarize the subjects with the texture types (Fig. 2) used in
the study and the way the texture changes in the different
experimental conditions. The proportion of responses “the
dihedral angle is larger than the right angle” is estimated for
all different experimental conditions, and for each subject the
resulting psychometric functions are obtained. For example,
the observed psychometric function associated with the first
tested young subject for the TM case is given in Table I.
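The way an observed psychometric function is tallied from single-stimulus trials can be sketched as follows. The trial records here are hypothetical (22 "larger" answers out of 30 presentations of the 65 deg angle, 27 out of 30 for 80 deg); the data layout and the function name are assumptions made for the example.

```python
from collections import defaultdict

def psychometric_function(trials):
    """Proportion of 'larger than a right angle' answers for each tested angle.

    trials: iterable of (angle_deg, said_larger) pairs, one pair per trial.
    """
    counts = defaultdict(lambda: [0, 0])  # angle -> [n_larger, n_total]
    for angle, said_larger in trials:
        counts[angle][0] += int(said_larger)
        counts[angle][1] += 1
    return {angle: n_larger / n_total
            for angle, (n_larger, n_total) in sorted(counts.items())}

# Hypothetical session fragment: 30 presentations of each of two angles.
trials = ([(65, True)] * 22 + [(65, False)] * 8 +
          [(80, True)] * 27 + [(80, False)] * 3)
proportions = psychometric_function(trials)
print(proportions)
```

Each entry of the result is one point of the observed psychometric function; a full session would contain one such point for each of the nine dihedral angles.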

All subjects passed a preliminary training session of 60 trials
in which a particular checkerboard pattern (Fig. 3, left) was
used to texture the dihedral angle under perspective projection
(Fig. 3, right). It helped the subjects to get familiar with the task
to perform. The results of training were not taken into account.

Fig. 3. Checkerboard pattern (left), angle under perspective projection (right).

The dihedral angle remained visible on the screen until an
answer was received. To respond, the younger subjects
used the buttons of a computer mouse while the elderly gave
an oral response that was recorded by the experimenter.


IV. EXPERIMENTAL AND RESEARCH LOGIC 


Once all psychometric functions have been obtained for all
experimental conditions and for each subject in the age-contingent
groups, we should answer several questions:
Question 1. What is the effect of texture (dots, lines) in the MIO
case? Does the manipulation of the texture we applied succeed
in eliminating its effect in the MIO case?

Question 2. How do human observers combine the visual cues in
order to estimate a surface’s slant? Do they base their responses
on a single cue (MIO or TIO) or on their combination TM?
If a single cue is used, which one, TIO or MIO, is more
informative?

Question 3. Which combination rule (NCC, pPCR5, or AVE)
used to combine the available visual cues predicts more adequately
the human way of cue integration?

Question 4. What is the common trend concerning the
visual cue combination performance of both age-contingent
groups, i.e. the performance of the so-called averaged-people
associated with each group? We denote these trends as the trend
of averaged-young-people and the trend of averaged-old-people,
respectively. They are based on the combined individual behaviors
in a particular group, revealing its intrinsic behavior as a
whole and reducing the uncertainties associated with individual
performances. All the tested subjects in the age groups are considered
as independent and equally reliable sources of information,
because each subject provides his/her own psychometric function
associated with the TM combination process, which should
be taken into account with equal weight to derive these trends.
Our goals are: (i) to find out which combination rule (NCC,
pPCR5 or AVE) is able to model correctly and adequately such
human age-contingent group trends in the reasoning process; (ii)
to analyze the special features characterizing these trends.


TABLE I
EXAMPLE OF PSYCHOMETRIC FUNCTION.

Angle’s value (deg)                                    20   35    50    65    80   95    110   125  140
Proportion of answers “angle > 90 deg” (30 trials)     0    0.12  0.17  0.73  0.9  0.95  0.98  1    1


V. RESULTS


The experimental psychometric functions for both age
groups and for all experimental conditions were compared
using the pfcmp extension of the MATLAB toolbox psignifit [14].
It implements a maximum-likelihood method [12] for fitting
the psychometric functions and compares the parameters of
the fits when estimated from the separate data sets and when
the two sets are combined. As a result, a significance value
p is produced as a measure of agreement between the examined
psychometric functions.
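To make the maximum-likelihood idea concrete, the sketch below fits a cumulative Gaussian to the example data of Table I by a coarse grid search over its two parameters. It is not the psignifit/pfcmp procedure: the cumulative Gaussian shape, the binomial likelihood, the recovery of response counts by rounding the tabulated proportions, and the search ranges are all assumptions made for this illustration.

```python
import math

ANGLES = [20, 35, 50, 65, 80, 95, 110, 125, 140]                 # deg (Table I)
P_LARGER = [0.0, 0.12, 0.17, 0.73, 0.90, 0.95, 0.98, 1.0, 1.0]   # Table I
N_TRIALS = 30                                                    # per angle

def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian used here as the psychometric function model."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def neg_log_likelihood(mu, sigma):
    """Binomial negative log-likelihood (up to a constant) of the Table I data."""
    nll = 0.0
    for x, p_obs in zip(ANGLES, P_LARGER):
        k = round(p_obs * N_TRIALS)  # assumed number of 'larger' answers
        p = min(max(cum_gauss(x, mu, sigma), 1e-9), 1.0 - 1e-9)  # avoid log(0)
        nll -= k * math.log(p) + (N_TRIALS - k) * math.log(1.0 - p)
    return nll

# Coarse grid search: mu in 40..100 deg, sigma in 2..40 deg, step 0.5 deg.
nll_hat, mu_hat, sigma_hat = min(
    (neg_log_likelihood(mu, sigma), mu, sigma)
    for mu in [m * 0.5 for m in range(80, 201)]
    for sigma in [s * 0.5 for s in range(4, 81)]
)
print(mu_hat, sigma_hat)
```

Here `mu_hat` estimates the angle judged "larger than a right angle" on half of the trials and `sigma_hat` the spread of the underlying Gaussian; a dedicated toolbox would additionally handle lapse rates and confidence intervals.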

• Results concerning Question 1 stated in Section IV.

The results show that in the MIO case the effect of the texture
type (lines or dots) is effectively eliminated: for 10 out of
12 observers in each age group the null hypothesis of equal
psychometric functions for both texture types could not be
rejected at the assumed significance level of p = 0.05. For
the static TIO case the comparison of the psychometric functions
obtained for line texture and for dots texture in both age
groups shows that the null hypothesis could not be rejected
at p = 0.05 for only 3 subjects in each age group. These
results suggest that the differences in texture type affect
the subjects’ performance significantly more in the static
case. The smaller effect of the texture type in the MIO case
provides indirect evidence that in these conditions the subjects’
performance is based on the motion information.

• Results concerning Question 2 stated in Section IV.

In order to answer this question, we have analyzed and
compared the experimental psychometric functions obtained
for each subject in both age-contingent groups given the
following cases:

• {dots-based TIO vs. dots-based MIO vs. dots-based TM}
• {line-based TIO vs. line-based MIO vs. line-based TM}

Older people rely more on the static information, especially
in the case of the dots texture type. Five out of 12 subjects do not
show a significant difference (p = 0.05) in their performance
for the TIO and TM cases for dots texture, and 4 out of 12
subjects for line texture. Young people rely more on the
dynamic information: the psychometric functions for the MIO and
TM cases do not differ significantly at p = 0.05 for 5 out of
12 subjects for dots texture and 4 out of 12 for line texture.

• Results concerning Question 3 stated in Section IV.

In order to answer this question correctly we should evaluate
the performance of the applied combination rules in the process
of visual cue integration, i.e. how well they predict human fusion
performance, on the basis of theoretically predicted psychometric
functions. A comparison between the experimentally obtained
and the predicted psychometric functions for all tested cases is
made on the basis of a goodness-of-fit test [13], one important
application of the chi-squared criterion:

$$\chi^2 = \sum_{j=1}^{J} \frac{(O_j - E_j)^2}{E_j}$$

where $\chi^2$ is an index of the agreement between the observed (O), i.e.
experimental, and the expected (E), i.e. predicted via a particular
fusion rule, sample values of the psychometric function. For
our case J = 9 represents the number of test angle values.
The critical value of the test for $\nu = J - 1 = 8$ degrees of
freedom at assumed p = 0.01 is $\chi^2 = 13.36$ [13]. This test is
applied for both texture types (dots and lines) to the following
pairs of psychometric functions:

• {TM case (experimental) - TM case (NCC rule)}
• {TM case (experimental) - TM case (pPCR5 rule)}
• {TM case (experimental) - TM case (AVE rule)}

TABLE II
CHI-SQUARED VALUES FOR OLDER SUBJECTS.

Subject   dotsTM pPCR5 / AVE   dotsTM NCC   lineTM pPCR5 / AVE   lineTM NCC
1         0.0653 / 0.0586      0.1359       0.1775 / 0.2032      0.5159
2         0.8415 / 0.9015      3.7232       0.0694 / 0.0663      0.0796
3         0.2359 / 0.2547      0.4360       0.0827 / 0.0934      0.1373
4         0.6995 / 0.6876      3.9117       0.1380 / 0.1461      0.1522
5         0.3618 / 0.3031      0.3751       0.2982 / 0.3098      0.3927
6         0.1066 / 0.1304      0.1387       0.1735 / 0.1943      0.2261
7         0.1859 / 0.1901      0.1935       0.1881 / 0.2101      0.4306
8         1.6944 / 1.7958      5.2330       0.3813 / 0.3114      0.4585
9         0.1697 / 0.2078      0.8814       1.0045 / 1.0062      1.5113
10        0.0368 / 0.0561      0.0566       0.1391 / 0.1411      0.1519
11        0.0909 / 0.0709      0.1021       0.0577 / 0.0499      0.0851
12        0.2664 / 0.2564      1.1320       0.1798 / 0.1682      1.5873
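The chi-squared index used in this comparison can be sketched as below. The observed values are the example psychometric function of Table I; the predicted values are hypothetical placeholders standing in for the output of a fusion rule, and skipping terms whose expected value is zero is an assumption of this sketch rather than something stated in the paper.

```python
CRITICAL_VALUE = 13.36  # critical value quoted in the text for 8 degrees of freedom

def chi_squared(observed, expected):
    """Sum over test angles of (O_j - E_j)^2 / E_j; terms with E_j = 0 are skipped."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0.0)

# Observed psychometric function (Table I) at the nine test angles.
observed = [0.0, 0.12, 0.17, 0.73, 0.90, 0.95, 0.98, 1.0, 1.0]
# Hypothetical predicted psychometric function from some fusion rule.
predicted = [0.01, 0.10, 0.20, 0.70, 0.90, 0.95, 0.98, 0.99, 1.0]

index = chi_squared(observed, predicted)
adequate = index < CRITICAL_VALUE  # prediction does not differ significantly
print(index, adequate)
```

A prediction is regarded as adequate when its index stays below the critical value, which is the comparison reported per subject in the chi-squared tables.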

In general, the results show that the pPCR5 and AVE
fusion rules predict human performance more adequately than
the NCC rule in all experimental cases. The differences between
the experimental psychometric functions and those estimated via
the pPCR5 and AVE rules, for all observers in both age
groups, are smaller than those obtained by the NCC rule. For
older subjects (Table II) all fusion rules predict psychometric
functions that do not differ significantly from the experimental
ones, but the differences in the fits are smaller for the
pPCR5 and AVE rules than for the NCC rule.
For younger subjects (Table III), however, the NCC rule does
not predict adequately the performance of the subjects in some
conditions. For Subjects no. 5 and no. 6 (dots-based TM case)
and for Subjects no. 4 and no. 9 (line-based TM case) the
obtained values (Table III) significantly exceed
the critical value of 13.36. The graphical results reflecting the fusion
behavior of younger subjects no. 4 and no. 9 in the line TM
case are shown in Fig. 4. These results reflect situations
in which the experimentally obtained psychometric functions
associated with the single cues (TIO and MIO) are characterized
by distant underlying Gaussian distributions. In this case the
pPCR5 and AVE fusion rules make predictions which model
human fusion behavior more correctly and adequately. They
are almost similar, but the pPCR5 rule performs better than the AVE
rule in these conflicting cases. In the integration process
based on the NCC rule, however, part of the available information was
neglected, because the visual cues with bigger means were
weighted more heavily (as described in Section II-A).

• Results concerning Question 4 stated in Section IV.

In order to evaluate the common trend in the performance 
of both age groups, we started with the assumption that the 
tested subjects within each group are independent individual 
sources of information/answers and all of them are equally 
reliable. The experimental and estimated
(via the different fusion rules) trends in the groups’ visual cue
combination performance are presented in Fig. 5:
subplots 1, 3 for the older group, and subplots 2, 4 for the younger
one. Subplots 1 and 2 show results for the line texture type and
subplots 3 and 4 for the dots texture type.

In order to compare the performance of the different fusion
rules in predicting the common trends, the
city-block errors between the corresponding pairs averaged-young/old-people
TM (experimental) - averaged-young/old-people
(NCC/pPCR5/AVE rules) for both texture types are
given in Table IV. The results show that the experimentally
obtained trends and those based on the pPCR5 and AVE
fusion rules are very close: for both age-contingent groups the errors
are at least about two times smaller than those obtained via the NCC fusion rule.
The pPCR5 and AVE rules predict the human model
of reasoning more correctly than the NCC rule. pPCR5 performs slightly better
than the AVE rule, utilizing all the available information (TIO
and MIO), even in case of conflict. NCC-based trends are
very sensitive to the sources (different subjects’ psychometric
functions) with the bigger means, neglecting in that way part of
the available information and acting as an amplifier of the
information by reducing the variances.

TABLE III
CHI-SQUARED VALUES FOR YOUNGER SUBJECTS.

Subject   dotsTM pPCR5 / AVE   dotsTM NCC   lineTM pPCR5 / AVE   lineTM NCC
1         0.2976 / 0.3011      0.8526       0.0218 / 0.0191      0.0258
2         0.0801 / 0.0932      0.1456       0.1264 / 0.1525      0.6591
3         0.2182 / 0.2076      0.2690       0.1157 / 0.1201      0.1347
4         1.4509 / 1.4432      1.4716       0.6354 / 0.6523      57.4916
5         8.1655 / 8.1762      45.1458      1.4695 / 1.4512      2.4105
6         3.2425 / 3.3195      34.1458      0.1953 / 0.2003      12.2206
7         0.0014 / 0.0021      0.0079       0.2810 / 0.2957      0.9054
8         0.9201 / 0.8925      6.6588       0.3542 / 0.3513      0.9365
9         0.4950 / 0.4861      0.5160       0.8665 / 0.9341      87.1105
10        0.7633 / 0.7527      0.8304       0.1554 / 0.1599      0.1927
11        0.4202 / 0.4259      0.4380       0.3949 / 0.3901      0.3977
12        0.6371 / 0.6458      4.4540       0.0532 / 0.0525      0.2447

TABLE IV
CITY BLOCK ERRORS BETWEEN EXPERIMENTAL AND PREDICTED TRENDS.

                        pPCR5   NCC    AVE
lineTM Older group      0.03    0.10   0.04
dotsTM Older group      0.06    0.11   0.04
lineTM Younger group    0.02    0.12   0.04
dotsTM Younger group    0.02    0.11   0.03
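A city-block (L1) error between an experimental and a predicted trend can be sketched as follows. The paper does not state whether the error is normalized, so the division by the number of sample points is an assumption of this sketch, and the two short trends are hypothetical values used only to show the call.

```python
def city_block_error(trend_a, trend_b, normalize=True):
    """Sum of absolute differences between two trends sampled at the same angles.

    With normalize=True (an assumption of this sketch) the sum is divided by
    the number of sample points, giving a mean absolute difference.
    """
    total = sum(abs(a - b) for a, b in zip(trend_a, trend_b))
    return total / len(trend_a) if normalize else total

# Hypothetical experimental and predicted group trends at three angles.
experimental = [0.10, 0.50, 0.90]
predicted = [0.20, 0.50, 0.80]

error_mean = city_block_error(experimental, predicted)
error_sum = city_block_error(experimental, predicted, normalize=False)
print(error_mean, error_sum)
```

Smaller values indicate a predicted trend that follows the experimental one more closely, which is how the entries of the city-block error table are compared across fusion rules.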

Fig. 4. Experimental and predicted performance for subjects no. 4 and no. 9.

Fig. 5. Experimental and predicted trends in performance of age-related groups.


VI. CONCLUSIONS 


The results obtained in this study show age-related differences
in the performance of the subjects in estimating the three-dimensional
shape of objects based on texture and
motion information. The task of the observers used in the
study required the estimation of surface slant, a viewpoint-dependent
characteristic of the visual stimulation that is important
for visual navigation and for object manipulation. Our
data suggest that the younger people are more sensitive to
differences in surface slant, but at the same time they are less
accurate in their estimates. This cannot assure robustness
to potential errors during the experiments and
leads to decisions which are less reliable than those taken
by older people. Younger people as a group rely mainly on
motion information, neglecting the texture information. Older people
are characterized by less sensitivity to differences in the
spatial characteristics of three-dimensional objects in the
real world, but they tend to compensate for this drawback by
higher accuracy in their answers. This leads to the ability
to utilize correctly most of the available stimulus information and
thus to improve the decision accuracy. The performance of
both age groups in combining static and dynamic information
is better described by the pPCR5 and AVE rules. In comparison
to the NCC rule, especially in conflicting cases, the pPCR5 fusion
rule utilizes all available stimulus information, and
this is achieved irrespective of the texture type (lines or dots).
In that way the pPCR5 fusion rule preserves the richness
of stimulus data in the process of visual stimulus combination.


ABSTRACT


Over the years, there have been many proposed methods in set-based tracking. One example of set-based methods is the use 
of Dempster-Shafer (DS) techniques to support belief-function (BF) tracking. In this paper, we overview the issues and 
concepts that motivated DS methods for simultaneous tracking and classification/identification. DS methods have some 
attributes, if applied correctly; but there are some pitfalls that need to be carefully avoided such as the redistribution of the 
mass associated with conflicting measurements. Such comparisons and applications are found in Dezert-Smarandache 
Theory (DSmT) methods from which the Proportional Conflict Redistribution (PCR5) rule supports a more comprehensive 
approach towards applying evidential and BF techniques to target tracking. In the paper, we overview two decades of 
research in the area of BF tracking and conclude with a comparative analysis of Bayesian, Dempster-Shafer, and the PCR5 
methods. 


Keywords: Dempster-Shafer, Belief Functions, DSmT, Target Recognition, Classification, & Identification, Tracking 


1. INTRODUCTION 


Humans and machines are typically trained for specific missions and/or scenarios [1]. One such case is classification of a 
moving target [2]. To integrate the benefits of human reasoning with machine methods, popular techniques of information 
fusion [3], target tracking [4], and pattern classification [5] are used. 


When the human approaches the target, either the target is moving, the human is moving, or both are moving [6]. 
Cognition, the act of directing attention to sensory information, can be used by the human to fuse track and identify (ID) 
information as a perception of a set of moving targets. Dynamic cognitive multitarget-multisensor fusion under uncertainty 
requires target selection which can be formulated as a belief filtering problem in which sensed target states and identities 
are represented as current situational beliefs. The objective of the human is to 1) abstract the number of tracks from the
tracking environment, 2) assess confidence levels from the target classification algorithm, and 3) integrate the information 
for real-time beliefs of the number and types of targets from a plausible set of targets through an interactive display [7]. 


Multitarget tracking and ID is a subset of sensor fusion, which includes selecting sensors [8], sensor recognition policies, 
and tracking algorithms for a given set of mission requirements [9] for situational awareness [10, 11]. For example, in a 
typical tactical aircraft, the onboard sensors are active radar, electro-optical/infrared (EO/IR), and navigation sensors, with 
each sensor having a variety of modes in which it can operate and features it can detect. Figure 1 shows the case of EO/IR
targets. The EO/IR sensor makes kinematic measurements to detect, track, and classify objects of interest while reducing 
user workload. In a dynamic and uncertain environment, a sensor manager, such as a human, must fuse the track and 
classification information to ID the correct target at a given time and can aid tracking algorithms by determining a set of 
tracks to follow and aid classification algorithms by constraining the set of plausible targets. 





Figure 1. (a) Electro/Optical (EO) image and (b) Infrared (IR) image of targets [12].

Multitarget tracking in the presence of clutter has been investigated through the use of data association algorithms [1].
Likewise, other multisensor fusion algorithms have focused on tracking targets from multiple look sequences [13]. One 
inherent limitation of current algorithms is that the number of targets needs to be known a priori. While tracking 
algorithms can speculate on the number of targets, the cognition of the number of targets can be afforded to the user. The 
human is presented both the tracking information and the accrued evidence for each target type as a confidence measure. 
Once the human has an ID belief in the number of targets, the tracking algorithm can be updated. The human must then 
cognitively determine from the set of targets how many, what type, and which target goes with which track [14]. 
Additionally, the human has the ability to eliminate those targets that are not plausible which reduces the number of tracks 
and the set of pose templates from which the classification algorithm must search. 


This paper presents a summary of DS methods in the last two decades. By introducing the operator or image analyst for 
cognitive fusion, it may offer a means to control some aspects of the computational burdens experienced by analytical data 
association techniques while improving track quality for multitarget tracking and ID in the presence of clutter. Section 2 
overviews many applications of DS tracking. Section 3 describes DST methods while Section 4 compares the DST 
methods to Bayesian formulations. Section 5 presents a contemporary approach using the proportional conflict 
redistribution rule (PCR5) which is compared to the other methods in Section 6. Section 7 draws some conclusions. 


2. TRACKING METHODS USING DEMPSTER-SHAFER THEORY 


One of the earliest known works in applying Dempster-Shafer (DS) methods to target tracking was by Jean Dezert for 
navigation [15], where the sensor is moving and the targets are stationary. The emerging benefits of DS methods 
were subsequently applied by Robin Murphy to robotic scene analysis [16]. Building on Murphy’s work, the DS methods were then 
applied to other robotic applications [17, 18, 19], albeit real-time control was still superior with Bayesian methods. At the 
same time, Johan Schubert applied DS methods for determining the number of tracks through the support and plausibility 
functions for the linking of submarine targets between tracks [20]. 


A few tracking and identification algorithms have been proposed for air-target tracking [21] and ground target tracking 
[22, 23] that extend Bayes’ rule for identification where the most probable target is selected when there is incomplete 
knowledge. For instance, there are times when unknown targets might be of interest that are not known at algorithm 
initiation. At other times, there is an unknown number of targets to track, or there are targets not trained for classification. One way to 
study the problem is the interaction of the human and the machine working synergistically - since the sensors are 
extensions of the human’s processing. The set theory approach to HRR target classification was proposed by Mitchell and 
Westerkamp and termed a Statistical feature based classifier (STaF) for air-target tracking [24]. In addition, Blasch [25, 
26, 27] presented a feature-based set-theory approach for ground target tracking. In both cases, classification and tracking, 
a set of features and a set of targets was investigated by extending the STaF algorithm as a belief filter for radar profiles 
analysis from which a plausible set of tracks and targets are made available to the user at each time instant [28]. 


Given the ability to track individual targets using advances in DS theory for target identification and classification, methods 
were then developed for tracking a group of targets. In this case, the like targets could be grouped together based on 
common characteristics [29, 30, 31]. Additionally, Li [32] investigated methods for convex optimization for enhanced ID 
processing. Using a combination of belief filtering and data association improved analysis of maneuvering targets [33, 34]. 
Other group tracking methods were postulated for cluster-to-track fusion [35]. Finally, the benefits of DS methods 
provided complementary information to tracking through Kalman weighting [36], mutual aiding [37] and pose estimation 
[38]. 


Beginning in 2005, efforts were made to extend traditional DS tracking methods [39] to that of advanced techniques using 
the proportional conflict redistribution rule (PCR5) [40]. Other tracking methods included multisensor [41], activity 
analysis [42], and out-of-sequence methods [43]. Also, at that time, methods of combining DS with nonlinear tracking 
methods such as the particle filter [44] and the unscented Kalman filter [45] were developed. Finally, a fusion rule based 
on DS methods was used to solve the association problem in target tracking [46]. 


With the demonstrated performance of many applications of DS techniques, research continued in the exploration of DS 
rules for classification to improve track accuracy [47] and maneuvering targets [48]. Multisensor techniques were applied 
for heterogeneous sensor measurements [49], such as DS tracking with unattended ground sensor measurements [50]. 
Currently, efforts are sought for tracking performance evaluation with DS techniques for tracking and identification 
improvement [51, 52, 53]. Further assessment includes combinations with non-linear tracking methods [54] and 
association of track segments [55]. More recent results have applied the developments from DS tracking from radar 
towards that of image processing [56, 57]. Having surveyed the many demonstrations of successful DS tracking, we now 
discuss DS methods, compare DST with Bayesian methods, and conclude with belief functions (BF) such as the 
contemporary PCR5 method for target tracking. 


3. BASICS OF DEMPSTER-SHAFER (DST) THEORY 


The Dempster-Shafer (DS) theory of evidence was devised as a means of dealing with imprecise evidence [58, 59] and has 
been applied to target classification [60]. Evidence concerning an unknown target is represented as a nonnegative set 
function $m : P(U) \to [0,1]$, where $P(U)$ denotes the set of subsets of the finite universe $U$, such that $m(\emptyset) = 0$ and 
$\sum_{S \subseteq U} m(S) = 1$. The set function $m$ is called a mass assignment and models a range of possible beliefs about propositional hypotheses of 
the general form $P_S \triangleq$ “object is in $S$”, where $m(S)$ is the weight of belief in the hypothesis $P_S$. The quantity $m(S)$ is usually 
interpreted as the degree of belief that accrues to $S$, but to no proper subset of $S$. The weight of belief $m(U)$ attached to the 
entire universe is called the weight of uncertainty and models our belief in the possibility that the evidence $m$ in question is 
completely erroneous. The quantities 


$$ Bel_m(S) \triangleq \sum_{O \subseteq S} m(O) \qquad (1) $$

$$ Pl_m(S) \triangleq \sum_{O \cap S \neq \emptyset} m(O) \qquad (2) $$


are called the belief and plausibility of the evidence, respectively, and $m(O)$ is the mass assignment for object $O$. The 
relationships $Bel_m(S) \le Pl_m(S)$ and $Bel_m(S) = 1 - Pl_m(S^c)$ hold identically, and the interval $[Bel_m(S), Pl_m(S)]$ is called the 
interval of uncertainty ($IOU_m$). Knowing that $Bel_m(S) \in [0,1]$ and $Pl_m(S) \in [0,1]$, three relationships exist. The first is a 
direct use of $Bel_m(S)$ to accept and $Pl_m(S)$ to reject measurements. Another method is to use the interval of certainty (IOC), 
defined as $[1, 1] - [Bel_m(S), Pl_m(S)]$ using interval subtraction. For example, $[1, 1] - [0.8, 0.9] = [1-0.9, 1-0.8] = [0.1, 0.2]$. 
Using the lower bound $Bel_m(S)$ and the upper bound $Pl_m(S)$ of the interval, we can assign a confidence measure 
$C_m = 1 + Bel_m(S) - Pl_m(S)$; for the same interval, $C_m = 1 + 0.8 - 0.9 = 0.9$. Finally, the $IOU_m = [Bel_m(S), Pl_m(S)]$ can be mapped to a Gaussian distribution for 
belief-based tracking, using the $IOU_m$ center as the mean $\mu_m$, rescaling the bounds to a Normal distribution, and tracking the 
estimates (mean and variance). For example, $\mu_m = (0.8+0.9)/2 = 0.85$. As another example, assume low belief with many 
measurements plausible, so that $Bel_m(S) = 0.3$ and $Pl_m(S) = 0.9$; then $C_m = 1 + 0.3 - 0.9 = 0.4$ is lower and the mean is $\mu_m = 0.6$. 
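The belief, plausibility, confidence measure, and interval mean defined above can be sketched in a few lines of Python. This is only an illustrative sketch: the two-element universe and the mass values are hypothetical, chosen so that the interval of uncertainty for "fighter" is the [0.8, 0.9] interval used in the text, and the function names are not from the original paper.

```python
def belief(m, S):
    """Bel_m(S): total mass of the focal sets O contained in S (eq. 1)."""
    return sum(w for O, w in m.items() if O <= S)

def plausibility(m, S):
    """Pl_m(S): total mass of the focal sets O that intersect S (eq. 2)."""
    return sum(w for O, w in m.items() if O & S)

def confidence(bel, pl):
    """C_m = 1 + Bel - Pl, i.e. one minus the width of the interval of uncertainty."""
    return 1.0 + bel - pl

def interval_mean(bel, pl):
    """Center of the interval of uncertainty, used as the Gaussian mean."""
    return 0.5 * (bel + pl)

# Hypothetical mass assignment over a two-target universe (assumed values).
U = frozenset({"fighter", "cargo"})
m = {frozenset({"fighter"}): 0.8, frozenset({"cargo"}): 0.1, U: 0.1}

S = frozenset({"fighter"})
bel, pl = belief(m, S), plausibility(m, S)
print(bel, pl, confidence(bel, pl), interval_mean(bel, pl))
```

Representing focal sets as `frozenset` objects keeps the subset and intersection tests of equations (1) and (2) as one-line set operations.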


The mass assignment can be recovered from the belief function via the Möbius transform [61]: 

$$ m(S) \triangleq \sum_{O \subseteq S} (-1)^{|S - O|} Bel_m(O). \qquad (3) $$


The set intersection quantity 

$$ (m \oplus n)(S) \triangleq \frac{1}{1 - K} \sum_{X \cap Y = S} m(X)\, n(Y) \qquad (4) $$

is called Dempster’s rule of combination, where $K \triangleq \sum_{X \cap Y = \emptyset} m(X)\, n(Y)$ is called the conflict between the evidence $m$ 
and evidence $n$. 
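As a minimal sketch of Dempster's rule of combination and the conflict K described above, the following Python function fuses two mass assignments whose focal sets are `frozenset` objects. The function name and the example masses are illustrative assumptions, not values from the paper.

```python
def dempster(m, n):
    """Combine two mass assignments with Dempster's rule.

    Returns the normalized fused masses and the conflict K, where K is the
    total product mass assigned to pairs of focal sets with empty intersection.
    """
    fused, K = {}, 0.0
    for X, mx in m.items():
        for Y, ny in n.items():
            S = X & Y
            if S:
                fused[S] = fused.get(S, 0.0) + mx * ny
            else:
                K += mx * ny
    if K >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {S: w / (1.0 - K) for S, w in fused.items()}, K

# Hypothetical evidence from two sources (assumed values).
F, C = frozenset({"fighter"}), frozenset({"cargo"})
U = F | C
m = {F: 0.6, U: 0.4}
n = {C: 0.5, U: 0.5}

fused, K = dempster(m, n)
print(K, fused)
```

Here the only conflicting pair is (fighter, cargo), so K is the product of those two masses, and the remaining products are rescaled by 1/(1 - K).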


In the finite-universe case, the Dempster-Shafer theory (DST) coincides with the theory of independent, nonempty random subsets of 
$U$ (see [62, 63]; or for a dissenting view, see [64]). Given a mass assignment $m$, it is always possible to find a random subset 
$\Sigma$ of $U$ such that $m(S) = p(\Sigma = S)$. In this case, $Bel_m(S) = p(\Sigma \subseteq S) = \beta_\Sigma(S)$ and $Pl_m(S) = p(\Sigma \cap S \neq \emptyset) = \rho_\Sigma(S)$, where $\beta_\Sigma$ and 
$\rho_\Sigma$ are the belief and plausibility measures of $\Sigma$, respectively. Likewise, we can construct independent random subsets $\Sigma$, $\Lambda$ 
of $U$ such that $m(S) = p(\Sigma = S)$ and $n(S) = p(\Lambda = S)$ for all $S \subseteq U$. Then, it is easy to show that 

$$ (m \oplus n)(S) = p(\Sigma \cap \Lambda = S \mid \Sigma \cap \Lambda \neq \emptyset) \qquad (5) $$

for all $S \subseteq U$. Thus, an intersection of overlapping sets can be fused to generate a global set confidence, where confidence is 
defined on the range $[1, 1] - [Bel_m, Pl_m]$ and uncertainty is $[Bel_m, Pl_m]$. The next 
section compares the set-theoretic approach with traditional Bayesian analysis. 


4. DEMPSTER-SHAFER VERSUS BAYESIAN THEORY 


Recently, Dezert [65] has shown that Dempster’s rule is consistent with probability calculus and Bayesian reasoning if and 
only if the prior P(X) is uniform. However, when the P(X) is not uniform, then Dempster’s rule gives a different result. Both 
Yen [66] and Mahler [67, 68] developed methods to account for non-uniform priors. Others have also tried to compare Bayes 
and DST methods [69, 70, 71, 72, 73, 74, 75]. Assuming that we have multiple measurements $Z = \{Z_1, Z_2, \ldots\}$ for object O 
being tracked, Bayesian and DS methods are developed next. 


Assuming conditional independence, one has the Bayes method: 


$$ P(X \mid Z_1 \cap Z_2) = \frac{P(X \mid Z_1)\, P(X \mid Z_2) / P(X)}{\sum_{i=1}^{N} P(X_i \mid Z_1)\, P(X_i \mid Z_2) / P(X_i)} \qquad (6) $$

With no information from $Z_1$ or $Z_2$, $P(X \mid Z_1, Z_2) = P(X)$. Without $Z_2$, $P(X \mid Z_1, Z_2) = P(X \mid Z_1)$, and without $Z_1$, 
$P(X \mid Z_1, Z_2) = P(X \mid Z_2)$. Using Dezert’s formulation, the denominator can be expressed as a normalization coefficient: 


$$ 1 - m_{12}(\emptyset) = 1 - \sum_{X_i, X_j \mid X_i \cap X_j = \emptyset} P(X_i \mid Z_1)\, P(X_j \mid Z_2) \qquad (7) $$

Using this relation, where $m_{12}(\emptyset)$ is the total probability mass of the conflicting information, the fused result is 

$$ P(X \mid Z_1 \cap Z_2) = \frac{1}{1 - m_{12}(\emptyset)} P(X \mid Z_1)\, P(X \mid Z_2) \qquad (8) $$

which corresponds to Dempster’s rule of combination using Bayesian belief masses with uniform priors. When the priors 
are not uniform, Dempster’s rule is not consistent with Bayes’ rule. For example, let $m_0(X) = P(X)$, $m_1(X) = P(X \mid Z_1)$, 
and $m_2(X) = P(X \mid Z_2)$, then 


$$ m(X) = \frac{m_0(X)\, m_1(X)\, m_2(X)}{1 - m_{012}(\emptyset)} = \frac{P(X)\, P(X \mid Z_1)\, P(X \mid Z_2)}{\sum_{i=1}^{N} P(X_i)\, P(X_i \mid Z_1)\, P(X_i \mid Z_2)} \qquad (9) $$


Thus, methods are needed to deal with non-uniform priors and appropriately redistribute the conflicting masses. 
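The difference between equations (6) and (9) can be checked numerically. The sketch below is illustrative only: the function names and the two-hypothesis posteriors are assumptions. It applies the Bayes rule of equation (6) and the Dempster combination of Bayesian masses of equation (9) to the same inputs, once with a uniform prior and once with a non-uniform prior.

```python
def bayes_fuse(p1, p2, prior):
    """Eq. (6): fuse P(X|Z1) and P(X|Z2) under conditional independence."""
    unnorm = [a * b / p for a, b, p in zip(p1, p2, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def dempster_bayesian(*masses):
    """Dempster's rule for Bayesian masses (singleton focal sets only):
    the normalized element-wise product, as in eqs. (8) and (9)."""
    prod = [1.0] * len(masses[0])
    for m in masses:
        prod = [x * y for x, y in zip(prod, m)]
    total = sum(prod)
    return [x / total for x in prod]

# Hypothetical posteriors over two hypotheses (assumed values).
p1 = [0.7, 0.3]   # P(X | Z1)
p2 = [0.6, 0.4]   # P(X | Z2)

uniform = [0.5, 0.5]
skewed = [0.2, 0.8]

bayes_uniform = bayes_fuse(p1, p2, uniform)
ds_uniform = dempster_bayesian(uniform, p1, p2)

bayes_skewed = bayes_fuse(p1, p2, skewed)
ds_skewed = dempster_bayesian(skewed, p1, p2)

print(bayes_uniform, ds_uniform)   # agree
print(bayes_skewed, ds_skewed)     # disagree
```

With the uniform prior the two results coincide; with the skewed prior Bayes' rule divides by the prior while Dempster's rule multiplies by it, so the fused values diverge.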


5. DEZERT-SMARANDACHE THEORY (DSmT) 


Recent advances in DS methods include Dezert-Smarandache Theory (DSmT). DSmT is an extension to the Dempster- 
Shafer method of evidential reasoning which has been detailed in numerous papers and texts: Advances and applications of 
DSmT for information fusion (Collected works), Vols. 1-3 [76]. In 2002, Dezert [77] introduced the methods for the 
reasoning and in 2003, presented the hyper power-set notation for DSmT [78]. Recent applications include the DSmT 
Proportional Conflict Redistribution rule 5 (PCR5) applied to target tracking. Other applications of DSmT can be found in 
the list of references at (http://www.onera.fr/staff/jean-dezert/) . 





The key contributions of DSmT are the redistributions of masses such that no refinement of the frame $\Theta$ is possible unless a 
series of constraints are known. For example, Shafer’s model [79] is the most constrained DSm hybrid model in DSmT. 
Since Shafer’s model, authors have continued to refine the method to more precisely address the combination of conflicting 
beliefs [80, 81, 82] and generalization of the combination rules [83, 84]. An adaptive combination rule [85] and rules for 
quantitative and qualitative combinations [86] have been proposed. Recent examples for sensor applications include 
electronic support measures, [87, 88], physiological monitoring sensors [89], and seismic-acoustic sensing [90]. 


Here we use the Proportional Conflict Redistribution rule no. 5 (PCR5) and no. 6 (PCR6) and the Dezert-Smarandache 
Probability (DSmP) selections which are discussed below. We replace Smets’ rule [80] by the more effective PCR5 or 
eventually the more simple PCR6 and replace the pignistic transformation by the more effective DSmP transformation to 
estimate target classification probabilities. All details, justifications with examples on PCR5 and PCR6 fusion rules and 
DSmP transformation can be found freely from the web in the DSmT compiled texts [76], Vols. 2 & 3. 


5.1. PCR5 and PCR6 fusion rules 


In the DSmT framework, PCR5 is generally used to combine the basic belief assignments (bba’s). PCR5 transfers the 
conflicting mass only to the elements involved in the conflict and proportionally to their individual masses, so that the 
specificity of the information is entirely preserved in this fusion process. Let $m_1(.)$ and $m_2(.)$ be two independent bba’s; then 
the PCR5 rule is defined as follows (see [76], Vol. 2 for full justification and examples): $m_{PCR5}(\emptyset) = 0$ and $\forall X \in 2^\Theta \setminus \{\emptyset\}$, 
where $\emptyset$ is the null set and $2^\Theta$ is the power set: 

$$ m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) + \sum_{\substack{X_2 \in 2^\Theta \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \qquad (10) $$

where $\cap$ denotes set intersection and all denominators in the equation above are different from zero. If a denominator is zero, that 
fraction is discarded. Additional properties and extensions of PCR5 for combining qualitative bba’s can be found in [76], 
Vol. 2 & 3. All propositions/sets are in a canonical form. A variant of PCR5, called PCR6, has been proposed by Martin and 
Osswald in [91], Vol. 2, for combining more than 2 sources. PCR6 coincides with PCR5 when one combines two sources. 
The difference between PCR5 and PCR6 lies in the way the proportional conflict redistribution is done as soon as three or 
more sources are involved in the fusion. For example, let’s consider three sources with bba’s $m_1(.)$, $m_2(.)$, and $m_3(.)$, $A \cap B = 
\emptyset$ for the model of the frame $\Theta$, and $m_1(A) = 0.6$, $m_2(B) = 0.3$, and $m_3(B) = 0.1$. With PCR5 the partial conflicting mass 
$m_1(A)\, m_2(B)\, m_3(B) = (0.6)(0.3)(0.1) = 0.018$ is redistributed back to A and B only with respect to the following proportions 
respectively: $x_A^{PCR5} = 0.01714$ and $x_B^{PCR5} = 0.00086$ because the proportionalization is: 


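$$ \frac{x_A^{PCR5}}{m_1(A)} = \frac{x_B^{PCR5}}{m_2(B)\, m_3(B)} = \frac{m_1(A)\, m_2(B)\, m_3(B)}{m_1(A) + m_2(B)\, m_3(B)} \qquad (11) $$

that is 

$$ \frac{x_A^{PCR5}}{0.6} = \frac{x_B^{PCR5}}{(0.3)(0.1)} = \frac{0.018}{0.6 + 0.03} \approx 0.02857 $$

thus $x_A^{PCR5} = 0.60 \cdot (0.02857) \approx 0.01714$ 
$x_B^{PCR5} = 0.03 \cdot (0.02857) \approx 0.00086$ 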


With the PCR6 fusion rule, the partial conflicting mass $m_1(A)\, m_2(B)\, m_3(B) = (0.6)(0.3)(0.1) = 0.018$ is redistributed back to A 
and B only with respect to the following proportions respectively: $x_A^{PCR6} = 0.0108$ and $x_B^{PCR6} = 0.0072$ because the PCR6 
proportionalization is done as follows: 

$$ \frac{x_A^{PCR6}}{m_1(A)} = \frac{x_{B,2}^{PCR6}}{m_2(B)} = \frac{x_{B,3}^{PCR6}}{m_3(B)} = \frac{m_1(A)\, m_2(B)\, m_3(B)}{m_1(A) + m_2(B) + m_3(B)} \qquad (12) $$

that is 

$$ \frac{x_A^{PCR6}}{0.6} = \frac{x_{B,2}^{PCR6}}{0.3} = \frac{x_{B,3}^{PCR6}}{0.1} = \frac{0.018}{0.6 + 0.3 + 0.1} = 0.018 $$

thus 
$x_A^{PCR6} = (0.6)(0.018) = 0.0108$ 
$x_{B,2}^{PCR6} = (0.3)(0.018) = 0.0054$ 
$x_{B,3}^{PCR6} = (0.1)(0.018) = 0.0018$ 

and therefore with PCR6, one finally gets the following redistributions to A and B: 
$x_A^{PCR6} = (0.6)(0.018) = 0.0108$ 
$x_B^{PCR6} = x_{B,2}^{PCR6} + x_{B,3}^{PCR6} = 0.0054 + 0.0018 = 0.0072$ 
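The two proportionalizations in this three-source example can be reproduced with a short Python sketch. It covers only this single partial conflict (A from source 1, B from sources 2 and 3), not the full PCR5 or PCR6 rules, and the function names are illustrative assumptions.

```python
def pcr5_partial(m1_A, m2_B, m3_B):
    """PCR5 split of the partial conflict m1(A) m2(B) m3(B), as in eq. (11):
    A is weighted by m1(A), B by the product m2(B) m3(B)."""
    conflict = m1_A * m2_B * m3_B
    b_weight = m2_B * m3_B
    ratio = conflict / (m1_A + b_weight)
    return m1_A * ratio, b_weight * ratio

def pcr6_partial(m1_A, m2_B, m3_B):
    """PCR6 split of the same partial conflict, as in eq. (12):
    each source's mass is weighted individually, then the B shares are added."""
    conflict = m1_A * m2_B * m3_B
    ratio = conflict / (m1_A + m2_B + m3_B)
    return m1_A * ratio, (m2_B + m3_B) * ratio

xA5, xB5 = pcr5_partial(0.6, 0.3, 0.1)
xA6, xB6 = pcr6_partial(0.6, 0.3, 0.1)
print(round(xA5, 5), round(xB5, 5))
print(round(xA6, 5), round(xB6, 5))
```

Both rules return exactly the 0.018 of conflicting mass; they differ only in how much of it goes back to A versus B.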


From the implementation point of view, PCR6 is simpler to implement than PCR5. For convenience, Matlab codes of PCR5 
and PCR6 fusion rules can be found in [76]. It is worth noting that there is a strong relationship between the PCR6 rule and the 
averaging fusion rule which is commonly used to estimate the probabilities in the classical frequentist interpretation of 
probabilities. Such a probability estimate cannot be obtained using the DS rule or the PCR5 rule, which is why we 
recommend using PCR6 when combining more than two basic belief masses altogether [92]. 


5.2. The DSmP Transformation 


The DSmP probabilistic transformation is an alternative to the classical pignistic transformation which allows us to increase 
the probabilistic information content (PIC), i.e. to minimize the Shannon entropy, of the approximated subjective probability 
measure drawn from any bba. Justification and comparisons of DSmP(.) with respect to BetP(.) and to other transformations 
can be found in details in [93, 76 Vol. 3, Chap. 3]. 


BetP: The pignistic transformation probability, denoted BetP, offers a compromise between the maximum of credibility Bel and 
the maximum of plausibility Pl for decision support. The BetP transformation is defined by $BetP(\emptyset) = 0$ and, $\forall X \in G^\Theta \setminus \{\emptyset\}$, by 

$$ BetP(X) = \sum_{Y \in G^\Theta} \frac{C_M(X \cap Y)}{C_M(Y)}\, m(Y) \qquad (13) $$

where $G^\Theta$ corresponds to the hyper-power set including all the integrity constraints of the model (if any). $G^\Theta = 2^\Theta$ if one 
adopts Shafer’s model for $\Theta$ and $G^\Theta = D^\Theta$ (Dedekind’s lattice) if one adopts the free DSm model for $\Theta$ [76]. $C_M(Y)$ denotes 
the DSm cardinal of the set Y, which is the number of parts of Y in the Venn diagram of the model M of the frame $\Theta$ under 
consideration [76, Book 1, Chap. 7]. The BetP reduces to the pignistic transformation of the Transferable Belief Model (TBM) when $G^\Theta$ reduces to the classical 
power set $2^\Theta$, i.e. when one adopts Shafer’s model. 


The DSmP transformation is defined by $DSmP_\epsilon(\emptyset) = 0$ and, $\forall X \in G^\Theta \setminus \{\emptyset\}$, by: 

$$ DSmP_\epsilon(X) = \sum_{Y \in G^\Theta} \frac{\sum_{\substack{Z \subseteq X \cap Y \\ C(Z) = 1}} m(Z) + \epsilon \cdot C(X \cap Y)}{\sum_{\substack{Z \subseteq Y \\ C(Z) = 1}} m(Z) + \epsilon \cdot C(Y)}\, m(Y) \qquad (14) $$

where $C(X \cap Y)$ and $C(Y)$ denote the cardinals of the sets $X \cap Y$ and $Y$ respectively; $\epsilon \ge 0$ is a small number which allows one to 
reach the highest PIC value of the approximation of m(.) into a subjective probability measure, and Z is the new evidence. 

Usually $\epsilon = 0$, but in some particular degenerate cases, when the $DSmP_{\epsilon=0}(.)$ values cannot be derived, the $DSmP_{\epsilon>0}$ values 
can however always be derived by choosing $\epsilon$ as a very small positive number, say $\epsilon = 1/1000$ for example, in order to be as 
close as we want to the highest value of the PIC. The smaller $\epsilon$, the better/bigger PIC value one gets. When $\epsilon = 1$ and when 
the masses of all elements Z having C(Z) = 1 are zero, $DSmP_{\epsilon=1}(.) = BetP(.)$. 
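To make equations (13) and (14) concrete, here is a minimal Python sketch restricted to Shafer's model, where the DSm cardinal of a set is assumed to equal its ordinary cardinality and the elements Z with C(Z) = 1 are the singletons. It evaluates both transformations only on the singletons of the frame. The function names and the example bba are hypothetical.

```python
def betp(m, frame):
    """Pignistic probability of each singleton under Shafer's model (eq. 13)."""
    return {x: sum(w / len(Y) for Y, w in m.items() if x in Y) for x in frame}

def dsmp(m, frame, eps=0.001):
    """DSmP_eps of each singleton under Shafer's model (eq. 14)."""
    out = {}
    for x in frame:
        X = frozenset({x})
        total = 0.0
        for Y, w in m.items():
            inter = X & Y
            # Masses of the singletons inside X ∩ Y and inside Y, plus eps * cardinal.
            num = sum(m.get(frozenset({z}), 0.0) for z in inter) + eps * len(inter)
            den = sum(m.get(frozenset({z}), 0.0) for z in Y) + eps * len(Y)
            if den > 0:
                total += w * num / den
        out[x] = total
    return out

# Hypothetical bba on a two-element frame (assumed values).
frame = ["A", "B"]
A, B = frozenset({"A"}), frozenset({"B"})
m = {A: 0.3, B: 0.1, A | B: 0.6}

p_bet = betp(m, frame)
p_dsm = dsmp(m, frame, eps=0.001)
print(p_bet, p_dsm)
```

In this example BetP splits the mass of the union evenly, while DSmP splits it in proportion to the singleton masses, which yields a sharper (higher-PIC) probability for A.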
6. COMPARATIVE RESULTS OF DS, BAYES, AND PCR5-BASED TRACKING 


Here we simulate two scenarios with the three rules of combination: Bayesian, Dempster-Shafer, and PCR5. For each 
scenario, we assume that the target information is collected from a sensor that is precise in the position measurements, but 
the uncertainty in either the sensor position accuracy or the classification information results in a confusion matrix (CM) 
formulation. With a two-object representation being tracked (e.g., the standard fighter/cargo example), we have CM = [O1 
O2; O2 O1]. In the first scenario, the target classification is CM = [0.75 0.25; 0.25 0.75], and the belief mass 
results are shown in Figure 2. 

Figure 2. (a) Object 1 and (b) Object 2 tracking and identification results. CM = [0.75 0.25; 0.25 0.75]. 


Figure 2 demonstrates that while there is uncertainty in the object tracking and classification, both the DS and Bayesian 
methods are close. The PCR5 rule results in better accuracy. Both DS and Bayesian methods have difficulty when the 
measurements change and suffer from prior-evidence biasing. In the next scenario, we decrease the sensor 
classification/ID accuracy, which results in more conflict in the analysis. For the Scenario 2 sensor model, we use CM = 
[0.65 0.35; 0.35 0.65]. 

Figure 3. (a) Object 1 and (b) Object 2 tracking and identification results. CM = [0.65 0.35; 0.35 0.65]. 


Figure 3 illustrates differences between the three methods. DS tracking methods are able to improve over standard 
Bayesian methods when there is conflict in the measurements (Fig 3a scan 10 to 20). However, as shown in Figure 3, the 
PCR5 rule demonstrates an ability to track and ID the target when the measurement information is conflicting and changing 
(Fig 3a scan 25 to 50). The simple example illustrates the power of the PCR5 rule over standard DS and Bayesian 
methods to deal with conflicting, imprecise, and varying target measurements for target tracking. 

7. CONCLUSIONS 


Conventional tracking techniques have difficulty in identifying targets when the number of targets is not known a priori, 
the targets are maneuvering, and there is conflict in the measurements. Throughout the last two decades, numerous 
researchers have explored Dempster-Shafer (DS) evidential (i.e., belief function) reasoning to solve the requirements of 
simultaneous tracking and identification. This paper has provided a literature review of most of the available publications 
that utilize the DS method in target, group/cluster, and multisensor tracking. Through a review of Bayesian, DS, and PCR5 
formulations; we presented a simulated comparative example to demonstrate the current state-of-the-art methods such as 
DSmT research [94]. The PCR5 method can be extended to nonlinear tracking and ID algorithms, coordinated with users 
for assisted tracking, and can enhance conventional covariance and information filter formulations [95]. The presented 
PCR5 technique demonstrates promise for multitarget tracking problems and warrants further exploration with real-world 
data where environmental effects, occlusions, lost sensor data, and unknown targets [96] are standard. 
Tracking Applications with Fuzzy-Based Fusion Rules 


Albena Tchamova 
Jean Dezert 


Abstract—The objective of this paper is to present and 
evaluate the performance of a particular fusion rule based on 
fuzzy T-Conorm/T-Norm operators for two tracking applications: 
(1) Tracking Object’s Type Changes, supporting the process of 
identification (e.g. friendly aircraft against hostile ones, fighter 
against cargo) and consequently improving the quality of 
generalized data association; (2) Alarms identification and 
prioritization in terms of degree of danger relating to a set of a 
priori defined out of the ordinary dangerous directions. The 
aim is to present and demonstrate the ability of the TCN rule to 
assure a coherent and stable way of identification and to improve 
the decision-making process over time. A comparison with the 
performance of the DSmT based PCR5 fusion rule and Dempster’s 
rule is also provided. 


Keywords—Objects’ type identification, Alarm classification, 
Data fusion, DSmT, TCN rule, PCR5 rule, Dempster’s rule. 


I. INTRODUCTION 


An important function of each surveillance system is to 
keep and improve target track maintenance performance, as 
well as to provide smart operational control, based on the 
intelligent analysis and interpretation of alarms coming from 
a variety of sensors installed in the observation area. Targets’ 
type estimates can be used during different stages of the target 
tracking process for improving data-to-track association and for the 
quality evaluation of complicated situations characterized by 
closely spaced and/or crossing targets [1], [2]. They support the 
process of identification, e.g. friendly aircraft against hostile 
ones, fighter against cargo. In such a case, although the attribute 
of each target is invariant over time, at the attribute-tracking 
level the type of the target committed to the (unresolved) 
track varies with time and must be tracked properly in order 
to discriminate how many different targets are hidden in the 
same unresolved track. Alarm classification and prioritization 
[3],[4],[5],[6],[7],[8] is a very challenging task, because in the case 
of multiple suspicious signals (relating to a set of a priori 
defined out of the ordinary dangerous directions), generated 
by a number of sensors in the observed area, it requires 
the most dangerous among them to be correctly recognized, 
in order to decide properly where the video camera should be 
oriented. There are cases when some of the alarms generated 
could be incorrectly interpreted as false, increasing the chance 
that they are ignored when they are really significant and 
dangerous. The resulting critical delay of the proper response 
could cause significant damage. In both cases above, the 
uncertainty and conflict encountered in objects’ and signals’ 
data could weaken or even mislead the respective surveillance 
system decision. That is why a strategy for an intelligent, scan 
by scan, combination/updating of the generated data is needed in 
order to provide the surveillance system with a meaningful 
output. In this paper we focus our attention on the ability 
of the so-called T-Conorm-Norm (TCN) fusion rule, defined 
within Dezert-Smarandache Theory (DSmT) of plausible and 
paradoxical reasoning, to improve the process of data fusion 
and to successfully finalize the decision-making procedures in 
both described surveillance cases. This work is based on 
preliminary research in [9],[10]. In section II we recall the basics of the 
Proportional Conflict Redistribution rule no. 5 (PCR5), defined 
within DSmT. Basics of the PCR5 based TCN fuzzy fusion rule 
are outlined in section III. Section IV presents the problem of 
alarm classification and examines the ability of the TCN fusion 
rule to solve it. In section V the performance of the TCN rule 
is analyzed for the problem of target type tracking. In 
both sections, a comparative analysis of the TCN rule solution with 
those obtained by PCR5 and the Dempster-Shafer (DS) rule is 
provided. Concluding remarks are given in section VI. 


II. BASICS OF PCR5 FUSION RULE 


The general principle of Proportional Conflict Redistribution 
rules is to: 1) calculate the conjunctive consensus between 
the sources of evidence; 2) calculate the total or partial 
conflicting masses; 3) redistribute the conflicting mass (total 
or partial) proportionally on non-empty sets involved in the 
model according to all integrity constraints. The idea behind 
the Proportional Conflict Redistribution rule no. 5 defined 
within DSmT [9] (Vol. 2) is to transfer conflicting masses 
(total or partial) proportionally to non-empty sets involved in 
the model according to all integrity constraints. Under Shafer’s 
model assumption of the frame $\Theta$, the PCR5 combination rule for 
only two sources of information is defined as: $m_{PCR5}(\emptyset) = 0$ 
and $\forall X \in 2^\Theta \setminus \{\emptyset\}$ 

$$ m_{PCR5}(X) = m_{12}(X) + \sum_{\substack{X_2 \in 2^\Theta \setminus \{X\} \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right] \qquad (1) $$

All sets involved in formula (1) are in canonical form. 
$m_{12}(X)$ corresponds to the conjunctive consensus, i.e.: 

$$ m_{12}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2). $$
All denominators are different from zero. If a denominator 
is zero, that fraction is discarded. No matter how big or 
small the conflicting mass is, PCR5 mathematically does a 
better redistribution of the conflicting mass than Dempster- 
Shafer’s rule, since PCR5 goes backwards on the tracks of the 
conjunctive rule and redistributes the partial conflicting masses 
only to the sets involved in the conflict and proportionally to 
their masses put in the conflict, considering the conjunctive 
normal form of the partial conflict. PCR5 is quasi-associative 
and also preserves the neutral impact of the vacuous belief 
assignment. 
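A minimal two-source sketch of formula (1) under Shafer's model is given below. It assumes focal sets are represented as `frozenset` objects, and the function name and example masses are illustrative. Instead of summing over X2 for each X, it loops once over all pairs of focal sets and sends each partial conflict back to the two sets involved, which yields the same totals.

```python
def pcr5(m1, m2):
    """Two-source PCR5 under Shafer's model (formula (1))."""
    fused = {}
    for X1, a in m1.items():
        for X2, b in m2.items():
            inter = X1 & X2
            if inter:
                # Conjunctive consensus m12.
                fused[inter] = fused.get(inter, 0.0) + a * b
            elif a + b > 0:
                # Partial conflict a*b goes back to X1 and X2 in proportion
                # to a and b; a zero denominator means the term is discarded.
                fused[X1] = fused.get(X1, 0.0) + a * a * b / (a + b)
                fused[X2] = fused.get(X2, 0.0) + b * b * a / (a + b)
    return fused

# Hypothetical bba's on a frame with two exclusive hypotheses (assumed values).
A, B = frozenset({"A"}), frozenset({"B"})
theta = A | B
m1 = {A: 0.6, B: 0.4}
m2 = {A: 0.2, B: 0.8}

fused = pcr5(m1, m2)
vacuous = pcr5(m1, {theta: 1.0})   # combining with the vacuous bba
print(fused, vacuous)
```

Because every partial conflict is returned in full to the sets that produced it, the fused masses still sum to one without a separate normalization step, and combining with the vacuous belief assignment leaves `m1` unchanged.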


III. BASICS OF TCN FUSION RULE 


The T-Conorm-Norm rule of combination [11] represents 
a class of fusion rules based on specified fuzzy t-Conorm and 
t-Norm operators [16]. Triangular norms (t-norms) and 
triangular conorms (t-conorms) are the most general families of binary 
functions that satisfy the requirements of the conjunction 
and disjunction operators, respectively. The TCN rule is defined 
within the DSmT based PCR5 fusion rule. Under Shafer’s model 
assumption of the frame $\Theta$, the TCN fusion rule for only 
two sources of information is defined as: $\tilde{m}_{TCN}(\emptyset) = 0$ and 
$\forall X \in 2^\Theta \setminus \{\emptyset\}$ 

$$ \tilde{m}_{TCN}(X) = \tilde{m}_{12}(X) + \sum_{\substack{X_2 \in 2^\Theta \setminus \{X\} \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X) \cdot Tnorm\{m_1(X), m_2(X_2)\}}{Tconorm\{m_1(X), m_2(X_2)\}} + \frac{m_2(X) \cdot Tnorm\{m_2(X), m_1(X_2)\}}{Tconorm\{m_2(X), m_1(X_2)\}} \right] \qquad (2) $$

where $\tilde{m}_{12}(X)$ corresponds to the conjunctive consensus, 
obtained by: 

$$ \tilde{m}_{12}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} Tnorm\{m_1(X_1), m_2(X_2)\}. $$

The TCN fusion rule requires a normalization procedure: 

$$ m_{TCN}(X) = \frac{\tilde{m}_{TCN}(X)}{\sum_{\substack{X \in 2^\Theta \\ X \neq \emptyset}} \tilde{m}_{TCN}(X)} $$

The attractive features of the TCN rule can be summarized as follows: it is very 
easy to implement; it satisfies the neutral impact of the Vacuous 
Belief Assignment; it is commutative and convergent to idempotence; 
it reflects majority opinion; and it assures adequate data processing 
in case of partial and total conflict between the information 
granules. The general drawback of this rule is related to the 
lack of associativity, which is not a main issue in temporal 
data fusion. 
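The TCN rule can be sketched in Python along the same lines as PCR5. The text does not fix a particular operator pair at this point, so the sketch assumes the min t-norm and the max t-conorm as defaults; any other pair can be passed in. The function name and example masses are likewise illustrative.

```python
def tcn(m1, m2, tnorm=min, tconorm=max):
    """Two-source TCN fusion under Shafer's model, followed by normalization.

    tnorm/tconorm default to min/max, which is an assumption of this sketch.
    """
    raw = {}
    for X1, a in m1.items():
        for X2, b in m2.items():
            t = tnorm(a, b)
            inter = X1 & X2
            if inter:
                # Conjunctive consensus computed with the t-norm.
                raw[inter] = raw.get(inter, 0.0) + t
            else:
                s = tconorm(a, b)
                if s > 0:
                    # Conflict is redistributed to the two sets involved.
                    raw[X1] = raw.get(X1, 0.0) + a * t / s
                    raw[X2] = raw.get(X2, 0.0) + b * t / s
    total = sum(raw.values())
    return {X: w / total for X, w in raw.items()}

# Hypothetical bba's (assumed values).
A, B = frozenset({"A"}), frozenset({"B"})
theta = A | B
m1 = {A: 0.6, B: 0.3, theta: 0.1}
m2 = {A: 0.2, B: 0.7, theta: 0.1}

fused_12 = tcn(m1, m2)
fused_21 = tcn(m2, m1)
with_vacuous = tcn(m1, {theta: 1.0})
print(fused_12, with_vacuous)
```

With symmetric operators the result does not depend on the order of the sources, and fusing with the vacuous belief assignment returns the original bba, matching the properties listed above.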


IV. ALARMS CLASSIFICATION APPROACH 


The approach assumes all the localized sound sources to 
be subjects of attention and investigation as possible indications 
of dangerous situations. The specific input sounds’ attributes, 
emitted by each source, are processed at the sensor level and 
evaluated in a timely manner for their contribution towards 
correct alarm classification (in terms of degree of danger). 
The applied algorithm considers the following steps: 

• Defining the frame of expected hypotheses as 
follows: $\Theta = \{\theta_1 = (E)mergency, \theta_2 = (A)larm, \theta_3 = (W)arning\}$. 
Here Shafer’s model 
holds and we work on the power-set: $2^\Theta = 
\{\emptyset, E, A, W, E \cup A, E \cup W, A \cup W, E \cup A \cup W\}$. 
The hypothesis with the highest priority is Emergency, 
followed by Alarm and then Warning. 


• Defining an input rule base to map the sounds’ 
attributes (so-called observations) obtained from all 
localized sources into non-Bayesian basic belief 
assignments $m_{obs}(.)$. 


• At the very first time moment k = 0 we start with 
an a priori basic belief assignment (history) set to be a 
vacuous belief assignment $m_{hist}(E \cup A \cup W) = 1$, 
since there is no information about the first detected 
degree of danger according to sound sources. 


• Combination of the currently received measurement’s bba 
$m_{obs}(.)$ (for each of the located sound sources), based on 
the input interface mapping, with the history’s bba, in 
order to obtain the estimated bba relating to the current 
degree of danger $m(.) = [m_{hist} \oplus m_{obs}](.)$. The TCN rule 
is applied in the process of temporal data fusion to 
update the bba’s associated with each sound emitter. 


• A flag for an especially high degree of danger has to 
be raised when, during the a priori defined scanning 
period, the maximum Pignistic Probability [9] is 
associated with the hypothesis Emergency. In this work, 
we assume Shafer’s model and we use the classical 
Pignistic Transformation [9], [15] to take a decision 
about the mode of danger. It is defined for $\forall A \in 2^\Theta$ by 

$$ BetP(A) = \sum_{X \in 2^\Theta} \frac{|X \cap A|}{|X|}\, m(X) $$

where |X| denotes the cardinality of X. 


A. Simulation Scenario 


A set of three sensors located at different distances from 
the microphone array are installed in an observed area for 
protection purposes, together with a video camera [13]. They 
are assembled with alarm devices: Sensor 1 with Sonitron, 
Sensor 2 with E2S, and Sensor 3 with System Sensor. In 
case of alarm events (smoke, flame, intrusion, etc.) they emit 
powerful sound signals with various durations and frequencies of 
intermittence (Table 1), depending on the nature of the event. 


Table 1. Sound signal parameters. 

Continuous (Warning) | Intermittent-I (Alarm) | Intermittent-II (Emergency) 
f_int = 0 Hz | f_int = 5 Hz | f_int = 1 Hz 
T_sig = 10 s | T_sig = 30 s | T_sig = 60 s 














The frequency of intermittence $f_{int}$ associated with the 
localized sound sources is utilized in the specific input interface 
(the rule base) below. 


Fig. 1. TCN rule performance for danger level estimation.


Rule 1: if f_int → 1 Hz then m_obs(E) = 0.9 and m_obs(E ∪ A) = 0.1.


Rule 2: if f_int → 5 Hz then m_obs(A) = 0.7, m_obs(A ∪ E) = 0.2
and m_obs(A ∪ W) = 0.1.


Rule 3: if f_int → 0 Hz then m_obs(W) = 0.6 and m_obs(W ∪ A ∪ E) = 0.4.
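The rule base can be sketched as a small lookup in Python. The sketch assumes that a rule fires when the measured intermittence frequency is closest to the nominal value of Table 1 (1 Hz, 5 Hz or 0 Hz); this nearest-value matching, the names `RULE_BASE` and `observe`, and the dictionary representation of a bba are assumptions made for illustration only.

```python
from typing import Dict, FrozenSet

Bba = Dict[FrozenSet[str], float]

# Nominal intermittence frequency (Hz) -> observed bba, as stated in Rules 1-3.
RULE_BASE: Dict[float, Bba] = {
    1.0: {frozenset("E"): 0.9, frozenset("EA"): 0.1},
    5.0: {frozenset("A"): 0.7, frozenset("AE"): 0.2, frozenset("AW"): 0.1},
    0.0: {frozenset("W"): 0.6, frozenset("WAE"): 0.4},
}


def observe(f_int: float) -> Bba:
    """Map a measured frequency to m_obs(.) via the nearest nominal frequency.

    Nearest-value matching is an assumption of this sketch, not a rule
    stated in the text.
    """
    nominal = min(RULE_BASE, key=lambda f: abs(f - f_int))
    return dict(RULE_BASE[nominal])


m_obs = observe(4.6)  # a noisy measurement close to 5 Hz selects Rule 2
```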


Three main cases are estimated: the probabilities of modes,
evaluated for Sensor 1 (associated with Emergency mode),
Sensor 2 (associated with Alarm mode), and Sensor 3
(associated with Warning mode). The decisions should be governed
at the video camera level and taken periodically, depending on: 1)
the specificities of the video camera (time needed to steer the video
camera toward a localized direction); 2) the time needed
to analyze correctly and reliably the sequentially gathered
information. We choose as a reasonable sampling period for
camera decisions T_dec = 20 sec, i.e. at every 10th scan.


B. TCN rule performance for danger level estimation. 


Fig. 1 shows the values of the Pignistic Probabilities of each
mode (E, A, W) associated with the three sound emitters (1st
source in E mode (top subplot), 2nd source in A mode
(middle subplot), and 3rd source in W mode (bottom
subplot)) during all 30 scans. Each source has
been perturbed with noise in accordance with the simulated
Ground Truth associated with the particular sound source. These
probabilities are obtained for each source independently as
a result of sequential data fusion of the m_obs(.) sequence using
the TCN combinational rule. For completeness of the study and for
comparison purposes, the respective performances of the PCR5
and DS rules are presented in Fig. 2 and Fig. 3.


The TCN rule shows a stable, quite proper and effective behavior,
following the performance of the PCR5 rule. A special feature
of the TCN rule performance is its smoothed estimates and more
cautious decisions taken at the particular decisive scans.


The results obtained show the strong ability of the PCR5 rule
to take care, in a coherent and stable way, of the evolution of all
possible degrees of danger related to all the localized sources.
It is especially significant in case of sound source data
discrepancies and conflicts when the highest priority mode
Emergency occurs. The PCR5 rule prevents a mistaken
decision and thus prevents the most dangerous case from being left
without immediate attention. A similarly adequate behavior
is established in cases of lower danger priority.


Fig. 3. Dempster's rule performance for danger level estimation.


The DS rule shows weakness in resolving the cases examined.
In the Emergency case, the DS rule does not reflect at all the newly obtained
informative observations supporting the Warning mode. This
pathological behavior reflects the dictatorial power of the DS
rule realized by a given source [12], which is fundamental
in Dempster-Shafer reasoning [14]. In our particular case,
however, the DS rule leads to the right final decision by coincidence,
but this decision cannot be accepted as coherent and reliable,
because it is not built on a consistent logical ground. In cases of
lower danger priority (perturbed Warning and Alarm modes), the
DS rule could cause a false alarm and can deflect the attention
from the existing real dangerous source by assigning a wrong
steering direction to the surveillance camera.


V. TARGET TYPE TRACKING APPROACH 


The problem can be simply stated as follows: 


• Let k = 1, 2, ..., k_max be the time index and consider
M possible target types T_i ∈ Θ = {θ_1, ..., θ_M} in
the environment; for example Θ = {Fighter, Cargo}
and T_1 ≜ Fighter, T_2 ≜ Cargo; or Θ =
{Friend, Foe, Neutral}, etc.


• At each instant k, a target of true type T(k) ∈ Θ
(not necessarily the same target) is observed by an
attribute-sensor (we assume a perfect target detection
probability here).


• The attribute measurement of the sensor (say, noisy
Radar Cross Section for example) is then processed
through a classifier which provides a decision T_d(k)
on the type of the observed target at each instant k.


• The sensor is in general not totally reliable and is
characterized by an M × M confusion matrix


C = [c_ij = P(T_d = T_j | TrueTargetType = T_i)]


The goal is to estimate T(k) from the sequence of
declarations made by the unreliable classifier up to time k, i.e. how
to build an estimator T̂(k) = f(T_d(1), T_d(2), ..., T_d(k)) of
T(k). The principle of the estimator is based on the sequential
combination of the current basic belief assignment (drawn
from the classifier decision, i.e. our measurements) with the prior
bba estimated up to the current time from all past classifier
declarations.


The algorithm follows the next main steps: 


• Initialization step (i.e. k = 0). Select the target type
frame Θ = {θ_1, ..., θ_M} and set the prior bba m^-(.)
as the vacuous belief assignment, i.e. m^-(θ_1 ∪ ... ∪ θ_M) =
1, since one has no information about the first target
type that will be observed.


• Generation of the current bba m_obs(.) from the current
classifier declaration T_d(k) based on the attribute
measurement. At this step, one takes m_obs(T_d(k)) =
c_{T_d(k)T_d(k)} and all the unassigned mass 1 −
m_obs(T_d(k)) is then committed to the total ignorance
θ_1 ∪ ... ∪ θ_M.


• Combination of the current bba m_obs(.) with the prior bba
m^-(.) to get the estimation of the current bba m(.).
Symbolically we will write the generic fusion operator
as ⊕, so that m(.) = [m_obs ⊕ m^-](.) = [m^- ⊕
m_obs](.). The combination ⊕ is done according to either
Dempster's rule (i.e. m(.) = m_D(.)) or the PCR5 rule
(i.e. m(.) = m_PCR5(.)).


• Estimation of the True Target Type is obtained from m(.)
by taking the singleton of Θ, i.e. a Target Type, having
the maximum of belief (or eventually the maximum
Pignistic Probability).


• Set m^-(.) = m(.); do k = k + 1 and go back to step
b), the generation of the current bba.
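The steps above can be sketched in Python for the two fusion operators named in the text, Dempster's rule and the two-source PCR5 rule; the TCN rule itself is not implemented here. The bba representation (dictionary from frozensets to masses), the function names, and the 0.9/0.1 confusion matrix values are assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, List

Bba = Dict[FrozenSet[str], float]


def conjunctive(m1: Bba, m2: Bba) -> Bba:
    """Unnormalized conjunctive rule; the empty set collects the conflict."""
    out: Bba = {}
    for (x, a), (y, b) in product(m1.items(), m2.items()):
        out[x & y] = out.get(x & y, 0.0) + a * b
    return out


def dempster(m1: Bba, m2: Bba) -> Bba:
    """Dempster's rule: drop the conflict and renormalize."""
    m = conjunctive(m1, m2)
    conflict = m.pop(frozenset(), 0.0)
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {x: v / (1.0 - conflict) for x, v in m.items()}


def pcr5(m1: Bba, m2: Bba) -> Bba:
    """Two-source PCR5: each partial conflict m1(X)m2(Y) is redistributed
    to X and Y proportionally to m1(X) and m2(Y)."""
    m = conjunctive(m1, m2)
    m.pop(frozenset(), None)
    for (x, a), (y, b) in product(m1.items(), m2.items()):
        if not (x & y) and a + b > 0.0:
            m[x] = m.get(x, 0.0) + a * a * b / (a + b)
            m[y] = m.get(y, 0.0) + b * b * a / (a + b)
    return m


def track(declarations: List[str], confusion: Dict[str, Dict[str, float]],
          frame: List[str], rule: Callable[[Bba, Bba], Bba]) -> List[str]:
    """Sequentially fuse classifier declarations; return the estimated types."""
    theta = frozenset(frame)
    prior: Bba = {theta: 1.0}  # vacuous prior bba
    estimates = []
    for t_d in declarations:
        c = confusion[t_d][t_d]  # diagonal entry for the declared type
        m_obs = {frozenset({t_d}): c, theta: 1.0 - c}
        m = rule(prior, m_obs)
        singles = {t: m.get(frozenset({t}), 0.0) for t in frame}
        estimates.append(max(singles, key=singles.get))  # max belief singleton
        prior = m
    return estimates


# Hypothetical confusion matrix, rows indexed by true type.
confusion = {"F": {"F": 0.9, "C": 0.1}, "C": {"F": 0.1, "C": 0.9}}
estimates = track(["C", "C", "F"], confusion, ["F", "C"], pcr5)
```

Passing `dempster` instead of `pcr5` as the `rule` argument runs the same loop with Dempster's rule, which makes the two behaviors easy to compare on the same declaration sequence.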
Fig. 4. Estimation of belief assignment for Cargo type.


A. Simulations results 


In order to evaluate the performance of the TCN-based
estimator, a set of Monte-Carlo simulations on a very
simple scenario for a 2D Target Type frame, i.e. Θ =
{(F)ighter, (C)argo}, is realized for a classifier with the
following confusion matrix:

$$C = \begin{bmatrix} 0.9 & 0.1 \\ 0.1 & 0.9 \end{bmatrix}$$


We assume there are two closely spaced targets: Cargo and
Fighter. Due to circumstances, the attribute measurements received
are predominantly from one or the other, and both targets
actually generate one single (unresolved kinematics) track. To
simulate such a scenario, a Ground Truth sequence over 100
scans was generated. The sequence starts with the observation
of a Cargo type and then the observation of the target type
switches twice to the Fighter type for different time
durations. At each time step k the decision T_d(k) is randomly
generated according to the corresponding row of the confusion
matrix of the classifier given the true target type (known in
simulations). Then the algorithm above is applied. The
simulation consists of 10000 Monte-Carlo runs. The computed
averaged performances (on the basis of the estimated belief masses
obtained by the tracker) are shown in Figures 4 and 5.
They are based on the TCN fusion rule realized with different
t-conorm and t-norm functions. On the same figures, for
comparison purposes, the respective performances of the PCR5
and DS rules are presented. It is evident that the PCR5 fusion rule
outperforms the results based on the TCN rule, because PCR5
allows a very efficient Target Type Tracking, drastically reducing
the latency delay for a correct Target Type decision. The TCN fusion
rule shows a stable and adequate behavior, characterized by a
smoother process of re-estimating the belief masses in
comparison to PCR5. The TCN fusion rule with t-conorm=max and
t-norm=bounded product reacts and adapts better than TCN
with t-conorm=sum and t-norm=min, followed by TCN with
t-conorm=max and t-norm=min.
Fig. 5. Estimation of belief assignment for Fighter type.


presented: (1) tracking of an object's type changes, supporting
the process of identification; (2) alarm identification and
prioritization in terms of the degree of danger relating to a set
of a priori defined, out of the ordinary dangerous directions.
The ability of the TCN rule to assure a coherent and stable way
of identification and to improve the decision-making process over
time is demonstrated. The different types of t-conorms
and t-norms available in fuzzy set/logic theory provide us with
a richness of possible choices when applying the TCN fusion
rule. The attractive features of the TCN rule are its easy
implementation and adequate data processing in case of conflict
between the information granules.
URREF Reliability versus Credibility in Information
Fusion (STANAG 2511)


Erik Blasch 
Kathryn B. Laskey 
Anne-Laure Jousselme 


Abstract— For many operational information fusion systems, both 
reliability and credibility are evaluation criteria for collected 
information. The Uncertainty Representation and Reasoning 
Evaluation Framework (URREF) is a comprehensive ontology that 
represents measures of uncertainty. URREF supports standards such 
as the NATO Standardization Agreement (STANAG) 2511, which 
incorporates categories of reliability and credibility. Reliability has 
traditionally been assessed for physical machines to support failure 
analysis. Source reliability of a human can also be assessed. 
Credibility is associated with a machine process or human 
assessment of collected evidence for information content. Other 
related constructs for URREF are data relevance and completeness. 
In this paper, we seek to develop a mathematical relation of weight of 
evidence using credibility and reliability as criteria for 
characterizing uncertainty in information fusion systems. 


Keywords: Reliability, Credibility, URREF, PCR5, STANAG 2511


I. INTRODUCTION


Information fusion is based on uncertainty reduction, and
the International Society of Information Fusion (ISIF)
Evaluation of Techniques of Uncertainty Reasoning Working
Group (ETURWG) has had numerous discussions on
definitions of uncertainty. One example is the difference
between reliability and credibility, which is called out in
NATO STANAG 2511 [1]. To summarize these ETURWG
discussions, we detail an analysis of credibility and reliability.


Information fusion consumers comprise users and machines of 
which the man-machine interface requires understanding of 
how data is collected, correlated, associated, fused, and 
reported. Simply stating an uncertainty representation of 
“confidence” is not complete. From URREF discussions [2]: 


reliability relates to the source, and 
credibility refers to the content reported. 


There are scenarios in which reliability and credibility need to 
be differentiated. Examples of information fusion application 
areas include medical, legal, and military domains. A common 
theme is involvement of humans in aggregating information. 
In many situations, there is cause for concern about the 
reliability of the source that may or may not be providing an 
accurate and complete representation of credible information. 
In cases where there is a dispute (e.g., legal), the actors each 
seek their own interests and thus are asked a series of 
questions by their own and opposing representations to judge
the veracity of their statements. 


Weight of Evidence (WOE) is addressed in various fields (risk 
analysis, medical domain, police, legal, and information 
fusion). In addition to credibility and reliability, Relevance 
assesses how a given uncertainty representation is able to 
capture whether a given input is related to the problem that 
was the source of the data request. A final metric to consider is 
completeness, which reflects whether the totality of evidence 
is sufficient to address the question of interest. These criteria 
relate to high-level information fusion (HLIF) [3] systems that 
work at levels three and above of the Data Fusion Information 
Group (DFIG) model. For the URREF, we then seek a
mathematical representation of the weight of evidence:


WOE = f(Reliability, Credibility, Relevance, Completeness)    (1)


where f is a function to be defined, with operations specifying how to
combine the criteria, such as a utility analysis.


Sect. II. provides related research and Sect. III overviews 
information fusion. Sect. IV discusses the weight of evidence 
including relevance and completeness. Sect. V describes the 
modeling of reliability and credibility with Sect. VI providing 
a simulation over evidence processing. Sect. VII provides 
discussion and conclusions. 


II. BACKGROUND


There are many examples of reliability analysis for system 
components [4]. Typically, a reliability assessment is 
conducted on system parts to determine the operational life of 
each component over the entire collection of parts [5]. A 
reliability analysis can consist of many attributes such as 
survivability [6], timeliness, confidence, and throughput [7, 8]; 
however the most notable is time to failure [9]. Reliability is 
typically modeled as a continuous analysis of a part; however, 
a discrete analysis can be conducted for the number of failures in
a given period of time [10]. Real-time analysis requires 
information fusion between continuous and discrete analysis 
over new evidence [11], covariance analysis [12, 13], and 
resource analysis [14] to control sensors. 


Assessing the performance of sensors (and operators) requires
analysis of the physical reliability of components. Data fusion
can aid in fault detection [15], predictive diagnostics [16],
situation awareness [17], and system performance. A model of
reliability includes time-dependent measures for operational
lifetime analysis and controllability, which are aspects of a data
fusion performance analysis [18]. Use of multiple systems can
aid in reducing failures through redundancy or system
reconfiguration in response to failed sensors [19] for such
applications as robotics [20, 21], risk analysis for situation
awareness [22, 23], and cyber threats [24, 25, 26].


Time-dependent measures such as times between failures are 
appropriate for processes that operate over time to produce a 
stream of outputs; and failures can render the output stream 
unreliable. For systems that respond to discrete queries or 
produce alerts, such as human operators in a fusion center or 
pattern recognition systems, reliability is assessed through 
correspondence between outputs and the actual situation. The 
confusion matrix (CM) is a typical measure [27]. Reliability 
also relates to the opinions of observers [28]. 


Credibility: To analyze the credibility of evidence, we can use
probabilistic or credibilistic frameworks such as Bayes,
Dempster-Shafer, or the proportional conflict
redistribution (PCR) principle, etc. [29, 30, 31]. Credibility of
a hypothesis can be assessed through its prior probability or
belief, and also through conflict: information is more credible
when it does not conflict with other information.


To summarize, 


• Reliability is an attribute of a sensor or other information source,
and measures the consistency of a measure of some phenomenon.
Reliability can be assessed by variance, probability of
occurrence, and/or a small spatial variance of precision.


• Credibility, also known as believability, comprises the content of
evidence captured by a sensor, which includes veracity,
objectivity, observational sensitivity, and self-confidence.


Reliability from the engineering design domain (e.g., mean 
time between failures) refers to consistent ability to perform a 
function, and reliability of a source means consistently 
measuring the target phenomenon. It may be useful to model 
source failures over time using an exponential or Poisson 
distribution. For information fusion and systems analysis, we 
need both a source element (reliability) as well as a content 
element (credibility) to characterize information quality. Next, 
we describe the information model that consists of data
sources from humans and machines that require uncertainty
analysis.


III. INFORMATION FUSION


A. Information Fusion Evaluation 


Information fusion combines information from multiple 
sources, distributions [32], or information over various 
system-level model processing levels as described in the Data 
Fusion Information Group (DFIG) model [33, 34, 35], 
depicted in Figure 1. The DFIG model outlines various 
processes for information fusion such as object assessment 
[36] (Level 1 — L1), situational assessment (L2), impact 
assessment (L3), and resource management (L4). Data and 
information fusion can be applied to assess the operating 
performance of algorithms [37], sources (reliability), as well 
as message content (credibility). For system-level analysis, it
is important to look at source context reliability of humans
(L5) and data sources for sensor (L4) and mission
management (L6).
Figure 1 - DFIG Information Fusion model.


In the DFIG model, the goal is to separate information fusion 
(L0-L3) from sensor control, platform placement, and user
selection to meet mission objectives (L4-L6) [38, 39, 40]. 
Information fusion across all the levels includes many metrics 
that need to be evaluated over uncertainty measures [41]. 
Challenges for information fusion, both at the hardware (i.e. 
components and sensors) and the software (i.e. algorithms and 
processes) levels were addressed by the ETURWG 
[http://eturwg.c4i.gmu.edu] [2]. Definitions of uncertainty 
measures such as accuracy [42], precision [43], reliability, and 
credibility are important for measures of effectiveness 
including validity and verification [44]. For example, accuracy 
(i.e., validity) measures distance from the truth, while 
precision (i.e., reliability) measures repeatability of results. 


Examples of information fusion include tracking accuracy 
[45, 46], tracking filter credibility [47], and object detection 
credibility [48, 49] which are important for information 
quality and quality of service metrics [50]. 


B. NATO STANAG 2511 


For STANAG 2511, as an update to STANAG 2022, there are
general listings of categories for reliability and credibility that
are of interest to the ETURWG [51, 52, 53]. Table 1 lists the
STANAG 2511 issues that provided initial discussion for the
ETURWG and the subsequent discussions in the URREF.
Reliability and credibility are independent criteria for
evaluation. The resultant rating will be expressed in the
appropriate combination of letter and number (STANAG
2511). Thus information received from a "usually reliable"
source which is adjudged as "probably true" will be rated as
"B2". Information from the same source of which the "truth
cannot be judged" will be rated as "B6".
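The letter-number rating can be illustrated with a short Python lookup built from the categories of Table 1. The dictionary and function names are illustrative choices and not part of STANAG 2511.

```python
# Reliability letters and credibility numbers as listed in Table 1.
RELIABILITY = {
    "A": "Completely reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Cannot be judged",
}
CREDIBILITY = {
    1: "Confirmed",
    2: "Probably true",
    3: "Possibly true",
    4: "Doubtful",
    5: "Improbable",
    6: "Cannot be judged",
}


def rating(reliability: str, credibility: str) -> str:
    """Combine a source reliability label and an information credibility
    label into a rating such as 'B2'. Labels are matched case-insensitively."""
    letters = {label.lower(): code for code, label in RELIABILITY.items()}
    numbers = {label.lower(): code for code, label in CREDIBILITY.items()}
    return f"{letters[reliability.lower()]}{numbers[credibility.lower()]}"


example = rating("usually reliable", "probably true")  # the B2 case in the text
```

Because the two criteria are independent, the two lookups never interact: the same source label yields the same letter whatever the credibility of the individual report.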


The URREF ontology, shown in Figure 2, distinguishes 
between reliability and credibility in evidence handling and 
evidence processing, respectively. In this paper, we utilize the
STANAG 2511 definitions of reliability (of source) and 
credibility (of information). From the ETURWG discussions, 
credibility and reliability also relate to weight of evidence, 
relevance, and completeness; although others are currently 
being explored. 
Table 1: STANAG 2511 Reliability and Credibility Relations and Definitions

RELIABILITY | CODE | EXPLANATION (from STANAG 2511)
Completely Reliable | A | A tried and trusted source which can be depended upon with confidence
Usually Reliable | B | A past successful source for which there is still some element of doubt in particular cases
Fairly Reliable | C | A past occasionally used source upon which some degree of confidence can be based
Not Usually Reliable | D | A source which has been used in the past but has proved more often than not unreliable
Unreliable | E | A source which has been used in the past and has proved unworthy of any confidence
Cannot be judged | F | It refers to a source which has not been used in the past

CREDIBILITY | CODE | EXPLANATION (from STANAG 2511)
Confirmed | 1 | If it can be stated with certainty that the reported information originates from another source than the already existing information on the same object
Probably true | 2 | If the independence of the source cannot be guaranteed, but if, from the quantity and quality of previous reports, its likelihood is nevertheless regarded as sufficiently established
Possibly true | 3 | If insufficient confirmation to establish any higher degree of likelihood, a freshly reported item of information does not conflict with the previously reported target behavior
Doubtful | 4 | An item of information which tends to conflict with the previously reported or established behavior pattern of an intelligence target
Improbable | 5 | An item of information which positively contradicts previously reported information or conflicts with the established behavior pattern of an intelligence target in a marked degree
Cannot be judged | 6 | If its truth cannot be judged








IV. WEIGHT OF EVIDENCE 


Weight of evidence (WOE) has different meanings in different 
contexts. A commonality is the need to integrate different 
sources or lines of evidence to form a conclusion or a decision. 


In the field of risk analysis, WOE consists of a set of methods
developed to assess the level of risks associated with factors or
causes [54]. In most cases, WOE is a means of synthesizing
information, while the solution adopted for weighing evidence is
not explicit, or the evidence is presented without any interpretation.
While some approaches rely on scoring techniques (see for instance
research on sediments assessment described in [55]), the overall
solutions remain qualitative in nature, developed for particular
applications and poorly adaptable. Further discussion on WOE, as
tackled within the risk analysis area, is provided in [56].


WOE is addressed in a similar way in the 
medical domain, in relation to the rise of a new 
set of medical practices known as “evidence 
based medicine”, promoting clinical solutions 
supported by practical experience, for which 
scientific support is not (yet) available. 


From a different perspective, WOE is used in 
the law and policy domain to convey a 
subjective assessment of an expert analyzing 
different items of evidence, most often in 
relation to a causal hypothesis [57]. Intuitively, 
the concept is used to signify that the value of 
evidence must be above a critical threshold to 
support decisions or conclusions. In law, standards of evidence
are recognized (for instance a three-level standard classifies
evidence as “preponderance”, “clear and convincing” and
“beyond a reasonable doubt”), but experts will
still exercise their judgment on the strength of evidence, as
there is no methodology to assess this parameter. Without such
methodologies, the variance in experts' judgments could be
important, as subjective factors inevitably shape the outcome
of the evidence in the evaluation.


A. Weight of evidence for information fusion 


In the field of information fusion, WOE captures the intuition
that there is more or less evidence in the data, and this can be
related to different parameters: the value of information itself
(whether a piece of evidence conveys rich or poor
information), the credibility of this information, in conjunction
with the reliability of its source (can or should we believe this
information), and finally the utility (or completeness) of this
information with respect to a considered goal or task (is this
data adding any detail to our existent data set?). WOE is an
attribute of information and its values should be assessed by
following a justifiable, repeatable and commonly accepted
process. Therefore, several solutions have been developed to
propose assessment mechanisms.


Figure 2 - URREF Ontology: Criteria Class [2].


Among them, [58] proposes a probabilistic approach for
information fusion where data items are weighted with respect
to the accuracy or reliability of their source. This solution
considers only independent information items, and its
adaptation to correlated information was developed in [59].


In the field of evidential reasoning, the discounting operation
introduced by Shafer [60] allows us to consider knowledge
about the reliability of information sources. Smets and
colleagues propose a method for learning a sensor's reliability
at various levels of detail defined by users [61]. This method is
generalized in Mercier et al. [62] by introducing
contextual discounting.
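As an illustration of the classical discounting operation, the following Python sketch scales every mass by a reliability factor and transfers the remainder to total ignorance. The bba representation, the function name `discount`, and the symbol `alpha` for the reliability factor are assumptions of this sketch; contextual discounting is not covered.

```python
from typing import Dict, FrozenSet, Iterable

Bba = Dict[FrozenSet[str], float]


def discount(m: Bba, frame: Iterable[str], alpha: float) -> Bba:
    """Classical discounting of a bba by a reliability factor alpha in [0, 1].

    m'(A) = alpha * m(A) for every A other than the whole frame, and
    m'(Theta) = alpha * m(Theta) + (1 - alpha).
    alpha = 1 leaves the bba unchanged; alpha = 0 yields the vacuous bba.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    theta = frozenset(frame)
    out = {a: alpha * v for a, v in m.items() if a != theta}
    out[theta] = alpha * m.get(theta, 0.0) + (1.0 - alpha)
    return out


# Hypothetical report from a source judged 50% reliable.
theta = frozenset({"F", "C"})
m = {frozenset({"F"}): 0.8, theta: 0.2}
m_disc = discount(m, theta, alpha=0.5)
```

The discounted bba keeps the source's preference for F but commits most of the mass to ignorance, which is how source reliability enters the fusion before any combination rule is applied.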


From a different perspective, [63] extends this frame in order 
to combine sources having different reliabilities and 
importance levels, while making a clear distinction between 
those notions. 


It should be noted that all references above consider only
attributes of sources, while the weight of evidence should also 
be a function of information credibility. Underlying the same 
intuition of assigning different importance levels to items 
when fusing information, we can also cite research on 
prioritized and weighted aggregation operators, described in 
[64] and [65]. 


B. Relevance in Information Fusion 


Relevance has these components: property relation and piece 
of evidence (POE). Relevance is often considered as a relation 
between one property (or feature) and a conditional. That 
means that a property is relevant (or related) to another one “if 
it leads us to change our mind concerning whether the second 
property holds” [66]. 


For instance, in classification, relevance criteria determine 
how well a feature (a property) discriminates between the 
classes (another property). In this case, the feature selection 
step aims at identifying the features that are most relevant to 
the classification problem. We distinguish between the filter 
mode and the wrapping mode. In the filter mode, measures of 
relevance are used to characterize the features. In the wrapping 
mode, a classifier is used and the optimal subset of relevant 
features is the one which maximizes the given performance 
measures, such as the recognition rate, the area under curve, 
etc., subject to a penalty on the number of features. Classical 
relevance measures are based on: mutual information, 
distances between probabilities, cardinality distances, etc. 


A piece of evidence (POE) is relevant if it impacts previous
beliefs. In this case, the relevance of a piece of information
can only be evaluated in conjunction with the combination
(updating, revision) operator used, as the null element and the
properties in general may differ from one operator to another.
For example, in Information Retrieval, the process is used to
assess the relevance of retrieved items (documents) based on a
given query.


Measures of relevance are based on traditional recall and 
precision measures: Precision is the fraction of retrieved items 
that are relevant, and Recall is the fraction of relevant items 
that have been retrieved [67]. 
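A minimal Python sketch of these two measures follows; the function name and the document identifiers are hypothetical, and empty sets are mapped to 0.0 by convention of this sketch.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str],
                     relevant: Iterable[str]) -> Tuple[float, float]:
    """Precision = |retrieved n relevant| / |retrieved|;
    recall = |retrieved n relevant| / |relevant|."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Four documents retrieved for a query, three documents actually relevant.
precision, recall = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d2", "d4", "d5"],
)
```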


Relevance is defined with respect to a goal (or a context) and 
assesses quantitative and qualitative information change. 


• Quantitative approaches: In quantitative approaches, the notion
of relevance is often intimately linked to the notion of
independence. For instance, in classical probability theory,
according to Gärdenfors [68], a proposition p is relevant to
another proposition r on evidence e if p and r are conditionally
dependent given e.


• Qualitative approaches: In qualitative approaches, the notion of
relevance is linked to the material implication (see for instance
the work of Goodman [69]): If a then b, a → b, then a should be
relevant to b.


C. How to evaluate a Relevance Criterion? 


First, we should clarify what is the object under evaluation, or 
what do we mean by uncertainty representation (UR). We 
follow here the distinction put forward in [70] about the 
difference between uncertainty calculi and decision 
procedures. 


If UR means uncertainty calculus (UC) (mathematical 
framework, theory), then we are asking if, for instance, 
possibility theory or probability theory is able “to capture how 
a given input is relevant [...]”, and to what degree. Although
this is a very general question with certainly no binary answer, 
some evaluation could be done. 


For instance, using a literature survey for document retrieval,
what is needed is a notional scale. An example of a scale to be
defined over methods, measures, or models is:


A. exist and are well developed with the theory and results are 
significant; 

B. exist but some further developments are required or results are 
not significant; 

C. are missing, or 

D. the concept is not addressed. 


We could conclude, for instance, that probability theory is very
good at dealing with relevance since a plethora of methods and 
measures are defined (A), compared to possibility theory for 
which only few methods exist (B). This would be an empirical 
evaluation, mainly based on a literature survey. Although we 
could conclude that a theory is very good at dealing with the 
relevance concept (numerous methods, measures, papers etc), 
an absence of evidence in this sense for another theory would 
not mean that the latter is not good. Rather it would identify a 
research gap. 


Each of the following elements can be evaluated separately: 


(UC-1) The mathematical model for uncertainty representation 
(UC-2) The uncertainty measures
(UC-3) The inference rules and combination operator 
(UC-4) Transformation functions 


If UR is a decision procedure (DP), we are asking if a 
particular algorithm, relying on possibly several theories, is 
able “to capture how a given input is relevant [...]”, and to
what degree. A DP distinguishes between the method and its 
implementation (e.g., fusion algorithm). Also, note that the 
same DP could be represented by several algorithms. 


Two underlying steps may be distinguished:


1. Identification and assessment of pieces of information (or 
properties) according to their relevance; and 
2. Filtering of irrelevant pieces of information. 


An example of an experiment to be elaborated could be:


i. Consider a dataset with both relevant and irrelevant pieces of
information;

ii. Each piece of information should have been previously labeled as
relevant or irrelevant, possibly with some degrees;

iii. Run the decision procedure (fusion algorithm) with only relevant
pieces of information and add progressively irrelevant (or less
relevant) ones; and

iv. Evaluate the decision procedure based on other independent
criteria such as the execution time, true positive rate,
conclusiveness, interpretation, etc.


We could observe for instance that a given Decision
Procedure, say DP-A, is better than another one, say DP-B,
because its execution time is lower with an equivalent true
positive rate. Even if DP-A is based on evidence theory and
DP-B is based on probability theory, concluding that evidence
theory is better for dealing with relevance than probability
theory is obviously not trivial and would require special care.


A finer-grained assessment of the relevance criterion can be
performed by assessing separately each of the following 
elements of an Atomic Decision Procedure (ADP): 


(ADP-1) Universe of discourse 

(ADP-2) Instantiated uncertainty representation 
(ADP-3) Reasoning step 

(ADP-4) Decision step 


For instance, one could assess if one particular universe of 
discourse better allows expressing relevance concepts than 
another. Relevance contributes to WOE. Evaluating whether a 
representation is able to deal with relevance should rely on 
other criteria of the ontology (if UR is a decision procedure) 
and/or on other empirical criteria to be defined (if UR is an 
uncertainty calculus). In addition to relevance affecting 
reliability and credibility, completeness needs to be considered. 


D. Evidence Completeness 


Reliability versus credibility is highly related to completeness 
of evidence. For example, we cannot postulate that: (P1) 
reliability of a source => credibility of information 
(that is, the more reliable a source is, the higher the credibility 
of the information it provides) WITHOUT assuming the 
completeness of the pieces of evidence available to the source. 


For example (Ming vase): Let's consider an apparent Ming 
vase (a counterfeit or a genuine one) to be analyzed. Suppose 
that an expert provides his report based on only two attributes 
(say the shape and color of the vase) and concludes (based on 
these two attributes/pieces of evidence only) that the vase is a 
genuine Ming vase. Because the report is based on this knowledge 
only, and because both attributes fit perfectly with those of a 
genuine Ming vase, the expert is 100% reliable (he didn't 
make a mistake) in assessing the two attributes; however, we 
are still unsure of his reliability in assessing whether the vase 
is genuine. Additional POE, if available, may be 100% reliable 
and support the opposite conclusion. For example, let's 
suppose that when looking at the vase we see the printed 
inscription "Made in Taiwan". We are now sure that we are 
facing a counterfeit Ming vase. 


So we see that the reliability and credibility notions are highly 
dependent on the underlying completeness of the pieces of 
evidence and the relationship of the evidence to the conclusion 
of interest. In the Ming vase example, if we treat the two 
attributes (color and shape) as complete evidence sufficient to 
establish the absolute truth, then if the expert is fully reliable, the 
information he/she provides becomes highly credible due to the 
reliability of the source and the completeness of the evidence. 


When there is incompleteness of POE, nothing conclusive can 
be inferred about credibility unless some additional 
assumptions are introduced about the evidence necessary to 
establish the truth. 


The fundamental question behind this is whether a source 
relying only on local/limited knowledge (evidence) can (or 
cannot) reach a conclusion about a hypothesis, or its contrary, 
with absolute certainty, so that no additional piece of evidence 
can revise his/her conclusion. Depending on the standpoint 
we choose, we accept or reject (P1), which makes a big 
difference in reasoning. In summary, the ETURWG analysis 
highlights uncertainty elements of a WOE. 


E. URREF Weight of Evidence 

With respect to the criteria defined by URREF, we can define 
weight of evidence as: 

WOE = f (Reliability, Credibility, Relevance, Completeness) 


where f is a function to be defined and relevance is related to 
the problem (or mission). 


This is a translation of the following reasoning: 


If (the source is reliable) then 

If (the information provided is credible) then 

If (this information is relevant to my problem) then 

If (this information can enrich my existent information set) then 
this information has some weight of evidence. 


The four terms above are URREF criteria, while the last 
corresponds to a task-specific parameter that affects utility. 
For instance, utility can be evaluated by taking into 
consideration a distance between the set of information 
already available and a new item to determine utility 
completeness. Next, we demonstrate a modeling technique 
that brings together reliability and credibility to instantiate 
WOE calculations. 
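Since the function f is left undefined in the text, the following is only a minimal sketch of one possible choice. It assumes each criterion is scored in [0, 1] and uses a plain product, which mirrors the nested "If ... then" reasoning: the weight of evidence vanishes as soon as any one criterion fails. Both the scoring scale and the product form are assumptions made for illustration, not the URREF definition.

```python
def weight_of_evidence(reliability, credibility, relevance, completeness):
    """Hypothetical WOE = f(...): product of four criteria scored in [0, 1]."""
    criteria = {
        "reliability": reliability,
        "credibility": credibility,
        "relevance": relevance,
        "completeness": completeness,
    }
    for name, value in criteria.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    woe = 1.0
    for value in criteria.values():
        woe *= value   # any criterion at 0 gates the whole weight to 0
    return woe


print(weight_of_evidence(1.0, 1.0, 1.0, 1.0))   # every criterion fully satisfied
print(weight_of_evidence(1.0, 1.0, 0.0, 1.0))   # irrelevant information carries no weight
```

Other aggregations (a minimum, a weighted mean) would encode different trade-offs between the criteria; the product is used here only because it is the simplest form that respects the gating structure.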
V. RELIABILITY AND CREDIBILITY ANALYSIS 


A reliability assessment affects the performance capability, 
maintainability, usability, and operational support cost of 
modern equipment systems. Knowing the system’s reliability is 
important for efficient and effective performance. Due to the 
high complexity of systems engineering integration, it is 
difficult to evaluate system-level reliability. Some ways to 
estimate system-level reliability include: (1) predicting 
operational reliability based on design data, (2) statistically 
analyzing operational data, or (3) developing performance models 
based on real-world operational constraints. 


Reliability prediction depends on models, such as life-cycle 
analysis. Typical models include Poisson, Exponential, 
Weibull, or Bernoulli distributions. Standard components, 
operating for a long time, may have data to support a priori 
analysis and modeling; however, the likelihood of reliability 
effectiveness is subject to real-world conditions that have not 
been modeled. For exponentially distributed failure times, the 
density function and the cumulative distribution function for 
time to failure of the system components are: 


$f(t) = \lambda e^{-\lambda t}$ (1) 

$F(t) = 1 - e^{-\lambda t}$ (2) 


The physical meaning of $F(t)$ is the probability that a failure 
(doubt) occurs before the time $t$, and $f(t)$ is the failure density: 
the probability that the component will fail in a small interval 
$t \pm \Delta t$ is given by $2 f(t) \Delta t$. As $t$ increases, the value of $F(t)$ 
approaches 1 as $t \to \infty$. 


For a fusion or reliability metric of a source, we need to map 
the semantics into quantifiable metrics based on the source 
context. Here we assume that we take discrete measurements 
and that a consistent source has almost no failures, whereas 
an inconsistent source fails quickly. As a quick look we 
show a notional example, but note that this model does not 
hold for human sources. For example, a “not usually reliable” 
source is difficult to quantify; caution is needed, and 
improvements are expected from the ETURWG. 


Classification systems process evidence features with an 
algorithm to assign evidence to classes. Results are tested 
against truth and reported using a confusion matrix (CM) [27]. 
A CM can thus be used to measure the reliability of a 
classification system. A CM is an estimate of the likelihoods of 
the accumulated evidence of a classifier. The elements of a 
confusion matrix are $c_{ij}$ = Pr{classifier decides $o_j$ when $o_i$ is 
true}, where i is the true object class, j is the assigned object 
class, and i = 1, ..., N for N true classes. The CM elements 
can be represented as probabilities as $c_{ij}$ = Pr{z = j | $o_i$} = 
p{$z_j$ | $o_i$}. To determine an object declaration, we need to use 
Bayes’ rule to obtain p{$o_i$ | $z_j$}, which requires the class priors, 
p{$o_i$}. We denote the priors and likelihoods as column vectors 
For M decisions, a confusion matrix would be of the form 


VI. RESULTS 


For the simulation, we formulate both reliability and credibility 
assessments to model the STANAG 2511 criteria 
for uncertainty representation. Note that we assume 
completeness and relevance in these simulations. 


A. Reliability 


For source reliability, the parameter of choice is $\lambda$, which 
captures the failure rate (the inverse of the mean time between 
failures). Figure 3 demonstrates the intuition that reliable and 
unreliable sources remain reliable and unreliable, respectively. 
However, the interesting cases are those termed “usually 
reliable” (code B), which affect the uncertainty analysis. 


Figure 3 — Reliability Analysis (plot title: Cumulative Reliability Analysis) 


For Figure 3, a representative assignment of the reliability 
parameters is: 


Code A Completely Reliable : $\lambda$ = 0 
Code B Usually Reliable : $\lambda$ = 0.001 
Code C Fairly Reliable : $\lambda$ = 0.01 
Code D Not Usually Reliable : $\lambda$ = 0.1 
Code E Unreliable : $\lambda$ = 1 
Code F Cannot be judged : $\lambda$ undefined 
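As a rough numerical check of how these rates separate the codes, the exponential model can be evaluated directly, with cumulative failure probability $F(t) = 1 - e^{-\lambda t}$. The sketch below is illustrative only: the names `RATES` and `failure_cdf` and the chosen time steps are assumptions, and code F is omitted because its rate is undefined.

```python
import math

# Failure rates (lambda) per STANAG 2511 reliability code, as listed for Figure 3.
RATES = {"A": 0.0, "B": 0.001, "C": 0.01, "D": 0.1, "E": 1.0}


def failure_cdf(rate, t):
    """Probability that a failure (doubt) has occurred before time t."""
    return 1.0 - math.exp(-rate * t)


for code, rate in RATES.items():
    values = ", ".join(f"t={t}: {failure_cdf(rate, t):.3f}" for t in (1, 10, 100))
    print(f"Code {code} (lambda={rate}): {values}")
```

Under this model a code A source never accumulates doubt and a code E source is almost certain to have failed after a few time steps, while the code B rate moves away from zero only slowly, in line with the intuition described for Figure 3.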


B. Credibility 


For credibility, since the STANAG 2511 definitions deal with 
conflicts, we utilize comparisons between Dempster-Shafer 
Theory and the PCR5 rule. Setting up the modeling using the CMs 
of classifiers from the information content, we can develop 
representative CMs for the different definitions: 


%%% Confusion Matrices for Classifiers (two sources) 
CM1=[0.999 0.001; 0.001 0.999] 

CM2=[0.95 0.05; 0.05 0.95] 

CM3=[0.70 0.30; 0.30 0.70] 


Now, we define credibility levels as follows, based on the 
confusion matrices of the two classifiers and whether or not 
their outputs agree: 


Confirmed: CM1, outputs agree 
Probably (independently confirmed): CM2, outputs agree 
Possibly (does not conflict): CM3, outputs agree 
Doubtful (tends to conflict): CM3, outputs disagree 
Improbable (conflicts): CM1 or CM2, outputs disagree 
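To make the agree/disagree cases concrete, the sketch below fuses the outputs of two classifiers over two classes with Dempster's rule and with PCR5. It is an illustration under stated assumptions, not the simulation behind the figures: the mapping from a confusion matrix to a mass vector (`bba_from_cm`, using uniform class priors) is assumed here, and the masses are Bayesian (singletons only), so both rules reduce to short closed forms.

```python
CM3 = [[0.70, 0.30], [0.30, 0.70]]   # rows: true class, columns: decided class


def bba_from_cm(cm, decided):
    """Mass over the two classes given a classifier decision (uniform priors assumed)."""
    column = [cm[0][decided], cm[1][decided]]
    total = sum(column)
    return [value / total for value in column]


def dempster(m1, m2):
    """Dempster's rule: conjunctive consensus normalized by 1 - conflict."""
    conflict = m1[0] * m2[1] + m1[1] * m2[0]
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return [m1[i] * m2[i] / (1.0 - conflict) for i in range(2)]


def pcr5(m1, m2):
    """PCR5: each partial conflict goes back to the two sets that caused it."""
    fused = [m1[0] * m2[0], m1[1] * m2[1]]
    for a, b in ((0, 1), (1, 0)):
        x, y = m1[a], m2[b]          # partial conflict x * y between classes a and b
        if x + y > 0:
            fused[a] += x * x * y / (x + y)
            fused[b] += y * y * x / (x + y)
    return fused


agree = (bba_from_cm(CM3, 0), bba_from_cm(CM3, 0))       # "possibly" case
disagree = (bba_from_cm(CM3, 0), bba_from_cm(CM3, 1))    # "doubtful" case
print("agree    DS:", dempster(*agree), "PCR5:", pcr5(*agree))
print("disagree DS:", dempster(*disagree), "PCR5:", pcr5(*disagree))
```

When the two outputs agree, Dempster's rule sharpens the belief more strongly than PCR5 because it discards the conflicting mass, whereas PCR5 returns that mass to the hypotheses involved in the conflict.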


Figure 4 shows a comparison of the CM results for a “possibly 
true” (code 3) case to validate that the PCR5 rule better supports 
evidence analysis than the Dempster-Shafer method. 

Figure 4 — DS versus PCR5 for “Possibly True” (Code 3) 


Figures 5 and 6 highlight the credibility relations associated 
with the DS and PCR5 formulations, where PCR5 better 
represents an expected analysis for calculating the STANAG 
2511 credibility codes. 

Figure 5 — DS Credibility of STANAG 2511 (Codes 1-5) 

Figure 6 — PCR5 Credibility of STANAG 2511 (Codes 1-5) 


VII. CONCLUSIONS 


In this paper, we overviewed uncertainty representation 
discussions from the ETURWG as related to the STANAG 2511 
reliability and credibility. Our URREF model for weight of 
evidence also includes relevance and completeness. We 
demonstrated modeling for reliability and credibility and 
provided simulations as related to the evidence reasoning methods 
of the PCR5 rule. These results provide a more tractable (and 
mathematical) ability to calculate the STANAG 2511 codes. 


Reliability and credibility affect higher levels of information 
fusion (i.e. beyond Level 2 fusion) grand challenges [71] of 
uncertainty representation [72], ontologies [73, 74] and 
uncertainty evaluation [75, 76]. Future research will further 
explore the uncertainty ontology within the URREF, use cases 
of real systems for a combined credibility/reliability 
assessment, and mathematical inclusion of other metrics such 
as relevance and completeness. 
Application of New Absolute and Relative 
Conditioning Rules in Threat Assessment 


Ksawery Krenc 
Florentin Smarandache 


Abstract—This paper presents new absolute and relative 
conditioning rules as a possible solution to multi-level conditioning 
in the threat assessment problem. 

An example of the application of these rules to a target 
observation threat model is provided. 

The paper also presents useful directions for managing 
multiple conditioning rules implemented in a real system. 


I. INTRODUCTION 


Contemporary Command & Control systems operate with 
multiple sensors in order to elaborate consistent and complete 
information required for decision making [1]. These systems, 
however, must face another very important requirement, which 
is cooperation with other information systems. Dealing with 
information of different processing levels is an inevitable 
consequence of the imposed demands, and requires specific tools for 
fusion in order to take this diversity into account effectively. 

As Threat Assessment is one of the most important tasks 
imposed on C2 systems [2], [3], these systems must be 
able to deal with information obtained from uncertain and 
even unreliable sources, where the quality measures are often 
subjective. For this reason the Theory of Evidence seems to be an 
appropriate approach. 

The Theory of Evidence known as Dempster-Shafer Theory 
(DST) [4] does not distinguish between fusion operations 
with regard to the uncertainty of the gathered information. The so-called 
Dempster’s rule of combination has been used to 
combine strong evidence from reliable sources, as well as 
poor evidence from unreliable sources, and hybrid cases (strong 
evidence with poor evidence). For many years researchers 
have been inventing diverse combination rules as alternatives to 
Dempster’s rule [5], [6], and [7]. These rules differ from 
each other mainly in the way the conflicting mass (referring to 
contradicting hypotheses) is distributed. However, to the 
authors' knowledge, none of these rules takes into 
account possible different processing levels of the integrated 
information, which cannot be expressed with basic belief 
assignments. 

The Theory of Evidence by Dezert and Smarandache (DSmT) [8] 
distinguishes two operations, combination and conditioning, 
for fusion of uncertain information and for integration of uncertain 
pieces of information with confirmed, i.e. certain, evidence 
respectively. Aware of this fact, the idea of using the conditioning 
operation (as an alternative to combination [9]) for 
the purpose of multiple level fusion has been published [10]. 
However, as presented in [11], [12], each of these 
solutions has its drawbacks, and in general neither is preferable 
over the other. The main disadvantage of combination as a 
multiple level fusion operation is that it does not take into 
account the predominance of the conditioning information 
from the external system over the local sensor data, and 
as a result it makes no distinction between the information 
processing levels. On the other hand, the main disadvantage 
of conditioning is that the condition is treated, by definition, 
as an absolute and literal fact, which is an assumption that is 
hard to accept in the real world. 


Originally published as Krenc K., Smarandache F., Application of 
New Absolute and Relative Conditioning Rules in Threat Assessment, 
Proc. of Fusion 2013 Int. Conf., Istanbul, Turkey, July 9-12, 2013, 
and reprinted with permission. 

For this reason another class of fusion rules, called relative 
conditioning, has been invented. In this type of rule the 
predominance of the condition over the uncertain evidence is 
stated explicitly, while the trust in the conditioning hypothesis 
is not absolute by definition. 

In this paper two of these rules will be presented as 
a possible solution to multi-level conditioning [12] in the threat 
assessment problem. 


II. NEW CONDITIONING RULES 


Let $\Theta$ be a frame of discernment formed by n singletons 
defined as: 

$\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}, \quad n \geq 2$ (1) 

and its Super-Power Set (or fusion space): 

$S^\Theta = (\Theta, \cup, \cap, \mathcal{C})$ (2) 

which means the set $\Theta$ is closed under union $\cup$, intersection 
$\cap$, and complement $\mathcal{C}$ respectively. 

Let m(.) be a mass: 

$m(\cdot): S^\Theta \to [0, 1]$ (3) 

and a non-empty set $A \subseteq I_t$, where $I_t = \theta_1 \cup \theta_2 \cup \ldots \cup \theta_n$ is the 
total ignorance. 

Conditioning of m(.|.) becomes: 

$\forall X \in S^\Theta, \quad m(X|A) = \sum_{Y \in S^\Theta,\ Y \cap A = X} m(Y) + \delta_{XA} \cdot \sum_{Y \in S^\Theta,\ Y \cap A = \emptyset} m(Y) \cdot w_A + \delta_{(X \cap A)\emptyset} \cdot m(X) \cdot w_0$ (4) 

where: 

$\delta_{AB} = 1$ if $A = B$, and $\delta_{AB} = 0$ if $A \neq B$ (5) 

and $w_0$ and $w_A$ are the weights for all sets which are 
completely outside of A, and respectively for all sets which are 
inside or on the frontier of A. 


$w_0, w_A \in [0, 1], \quad w_0 + w_A = 1$ (6) 


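For a more refined/optimistic redistribution, all masses of the 
elements situated outside of A are redistributed according to 
formula (7): 

$\forall X \in S^\Theta, \quad m(X|A) = \sum_{Y \in S^\Theta,\ Y \cap A = X} m(Y) + \frac{m(X)}{\sum_{Y \in S^\Theta,\ Y \subseteq A,\ m(Y) \neq 0} m(Y)} \cdot \sum_{Y \in S^\Theta,\ Y \cap A = \emptyset} m(Y) \cdot w_A + \delta_{(X \cap A)\emptyset} \cdot m(X) \cdot w_0$ (7) 

where the proportional (second) term applies only to $X \subseteq A$. 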

From the practitioner’s point of view these formulas provide 
directions on how the mass of hypotheses not involved or only 
partially involved in the condition should be redistributed. In order 
to explain the idea of these rules it is suggested to consider a 
simple example of a model consisting of three hypotheses: A, 
B, and C, where A and B overlap each other, and C is disjoint 
from both. Assume the condition is A. 

For this example, application of the rule (4) will cause the 
following action: 

— former masses of A and A∩B remain unchanged, supplying 
the A and A∩B hypotheses respectively, 
— former mass of B is transferred to A∩B, 
— former mass of C is transferred to A. 

When applying the absolute version of the rule (4) all 
masses are transferred exactly as described above. Otherwise, 
i.e. in relative conditioning, the mass of C is weighted according 
to the given $w_0$ and $w_A$. 

Application of the rule (7) will cause the following action: 

— former masses of A and A∩B remain unchanged, supplying 
the A and A∩B hypotheses respectively, 
— former mass of B is transferred to A∩B, 
— former mass of C is transferred to A and A∩B proportionally 
to their masses. 

Similarly to the rule (4), when applying the absolute 
version of the rule (7) all masses are transferred exactly as 
described above. Otherwise, i.e. in relative conditioning, the mass 
of C is weighted according to the given $w_0$ and $w_A$. 


Figure 1. Mass transfer in case of application of the rule (4) 

Figure 2. Mass transfer in case of application of the rule (7) 


III. THREAT ASSESSMENT EXAMPLE 


In order to illustrate application of the introduced rules it 
is suggested to consider the following conditioning example 
referring to the threat assessment problem. Assume the frame 
of discernment is defined as: 


$\Theta = \{F, H, U, N\}$ (8) 


where: 


— F denotes FRIEND, 

— H denotes HOSTILE, 

— U denotes UNKNOWN, 
— N denotes NEUTRAL. 


Additionally assume: 


$S = H \cap U$ (9) 
$A = F \cap U$ (10) 
$K = F \cap H$ (11) 
$J = F \cap H \cap U$ (12) 


where: 


— S denotes SUSPECT, 

— A denotes ASSUMED FRIEND, 

— K denotes FAKER i.e. FRIEND acting as HOSTILE for 
training purposes, [13], [14], [15], and [16], 
— J denotes JOKER i.e. FRIEND acting as SUSPECT for 
training purposes, [13], [14], [15], and [16]. 

Consider a scenario where a local system, equipped with 
sensors and performing target threat observation and 
information fusion, is informed by an external system about 
its decision regarding the observed target. The decision 
transferred to the local system is that the target is FRIEND, 
which constitutes the conditioning information. 

Figure 3. Venn diagram of the observed target threat 


Figure 3 shows a Venn diagram describing the target 
threat observation model, where information obtained 
from the external system has been colored in gray. Notice 
that the model refers to observation of the target threat 
(not to the target threat in itself), which means it describes 
what the target looks like (not what the target really is). This 
justifies why FAKER may be defined as 
the intersection of FRIEND and HOSTILE, and not as a subset 
of FRIEND, which is how FAKER is defined in [13], [14], 
[15], and [16]. 

Consider that the local system has already performed sensor 
fusion and its results are summarized in the basic belief 
assignment (bba) below: 


m(F) = 0.2, m(H) = 0.1, m(U) = 0.1 
m(A) = 0.1, m(S) = 0.1, m(K) = 0.1 
m(J) = 0.2, m(N) = 0.1 


Application of (4) leads to the following updated bba for 
absolute ($w_0 = 0$ and $w_A = 1$) conditioning: 


m(F|F) = 0.3, m(H|F) = 0, m(U|F) = 0 
m(A|F) = 0.2, m(S|F) = 0, m(K|F) = 0.2 
m(J|F) = 0.3, m(N|F) = 0 


For the relative conditioning with the weights 
$w_0 = 0.3$ and $w_A = 0.7$ one should get: 


m(F|F) = 0.27, m(H|F) = 0, m(U|F) = 0 
m(A|F) = 0.2, m(S|F) = 0, m(K|F) = 0.2 
m(J|F) = 0.3, m(N|F) = 0.03 


For the absolute opposite ($w_0 = 1$ and $w_A = 0$) conditioning 
one should get: 


m(F|F) = 0.2, m(H|F) = 0, m(U|F) = 0 
m(A|F) = 0.2, m(S|F) = 0, m(K|F) = 0.2 
m(J|F) = 0.3, m(N|F) = 0.1 


Application of (7) leads to the following updated bba for 
absolute conditioning: 


m(F|F) = 0.233, m(H|F) = 0, m(U|F) = 0 
m(A|F) = 0.217, m(S|F) = 0, m(K|F) = 0.217 
m(J|F) = 0.333, m(N|F) = 0 


For the relative conditioning with the weights 
$w_0 = 0.3$ and $w_A = 0.7$ one should get: 


m(F|F) = 0.223, m(H|F) = 0, m(U|F) = 0 
m(A|F) = 0.212, m(S|F) = 0, m(K|F) = 0.212 
m(J|F) = 0.323, m(N|F) = 0.03 


For the absolute opposite conditioning one should get: 


m(F|F) = 0.2, m(H|F) = 0, m(U|F) = 0 
m(A|F) = 0.2, m(S|F) = 0, m(K|F) = 0.2 
m(J|F) = 0.3, m(N|F) = 0.1 
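The numbers above can be reproduced with a short script. The sketch below is one possible reading of rules (4) and (7), not the authors' implementation: each hypothesis is encoded as a set of elementary Venn regions (the region names such as "fh" are hypothetical labels), intersection is plain set intersection, and the flag `proportional` switches between transferring the outside mass entirely to the condition, as in rule (4), and sharing it among the focal elements inside the condition in proportion to their original masses, as in rule (7).

```python
from collections import defaultdict

# Hypotheses as sets of elementary Venn regions (hypothetical labels).
F = frozenset({"f", "fh", "fu", "fhu"})
H = frozenset({"h", "fh", "hu", "fhu"})
U = frozenset({"u", "fu", "hu", "fhu"})
N = frozenset({"n"})
A, S, K, J = F & U, H & U, F & H, F & H & U

m = {F: 0.2, H: 0.1, U: 0.1, A: 0.1, S: 0.1, K: 0.1, J: 0.2, N: 0.1}


def condition(bba, cond, w_A=1.0, proportional=False):
    """Relative conditioning on `cond`; w_A = 1 gives the absolute version."""
    w_0 = 1.0 - w_A
    out = defaultdict(float)
    outside = 0.0
    for Y, mass in bba.items():
        if Y & cond:
            out[Y & cond] += mass     # mass of Y moves to Y intersected with the condition
        else:
            out[Y] += w_0 * mass      # share kept by sets disjoint from the condition
            outside += mass
    inside = {X: mass for X, mass in bba.items() if X <= cond and mass > 0}
    if proportional and inside:       # rule (7): share by original masses inside cond
        total = sum(inside.values())
        for X, mass in inside.items():
            out[X] += w_A * outside * mass / total
    else:                             # rule (4): everything goes to the condition itself
        out[cond] += w_A * outside
    return dict(out)


rule4 = condition(m, F, w_A=0.7)
rule7 = condition(m, F, w_A=0.7, proportional=True)
for name, X in (("F", F), ("A", A), ("K", K), ("J", J), ("N", N)):
    print(name, round(rule4[X], 3), round(rule7[X], 3))
```

Setting `w_A=1.0` or `w_A=0.0` gives the absolute and absolute opposite cases, respectively.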


Analysis of the obtained results shows that there are 
substantial differences in results between conditioning rules (4) 
and (7) for the considered case. Depending on the particular 
degree of belief in the condition (values of $w_0$ and $w_A$), the mass 
of the condition (FRIEND), as well as the masses 
of hypotheses contained in the hypothesis of the condition 
(FAKER, JOKER, ASSUMED FRIEND), have been supplied 
with masses of hypotheses not contained in the condition 
(HOSTILE, UNKNOWN, SUSPECT, and NEUTRAL). 

For both rules, absolute conditioning has been considered 
first, as a special case of relative conditioning. 
Second, relative conditioning 
has been performed with given weights $w_0$ and $w_A$. 
Then, the absolute opposite conditioning has been presented 
as another special case of relative conditioning. 

The reason for the absolute opposite conditioning in this 
case is purely illustrative. Theoretically, it could be useful 
if the condition hypothesis was complex (expressed as union 
or intersection of multiple hypotheses) and it was convenient 
to consider the complement of the condition. However, in most 
of the cases the condition, as output of the external system is 
simple. Thus, it is very unlikely that such kind of conditioning 
would be applied in threat assessment. 

Regarding the distinction between the presented rules, in this case 
the essential difference between conditioning rules (4) and 
(7) resides in the manner the mass of the NEUTRAL hypothesis 
is redistributed. For the rule (4) the mass of NEUTRAL is 
transferred completely to the mass of FRIEND, while for 
the rule (7) the mass of NEUTRAL is transferred to FRIEND, 
JOKER, FAKER, and ASSUMED FRIEND proportionally 
to their masses. In other words, in case of the rule (7) 
the redistribution is performed with a higher degree of trust 
in the adequacy of the target threat observation model. 
Therefore it may be regarded as more optimistic in comparison 
to the pessimistic rule (4). 


IV. CHOOSING THE PROPER CONDITIONING RULE 


Choosing the proper rule is one of the most important 
questions related to application of any fusion techniques 
(conditioning and combination). Since there are many rules 
of combination and conditioning [8], [11], [9], [10], and even 
more possible fusion cases, the choice of any particular rule for 
the particular case could be a topic of papers for the next few 
decades. Moreover, since there are no existent standardized 
fusion cases for particular domains the choice of the optimal 
rule seems to be a philosophical problem. 

Since two rules of conditioning are proposed in this paper, 
the problem of selecting the proper one still 
holds. Additionally, each of these rules introduces weights 
($w_0$ and $w_A$) in order to establish the ’relativity’ of the 
conditioning, and setting particular values of these weights requires 
a comment. 

To the authors' knowledge [11], [9], [10], 
and [12], in most cases the selection of a particular rule for 
conditioning (as well as combination) is done experimentally. 
For a particular fusion task, e.g. threat assessment in a 
Command and Control system, one chooses the rule which returns 
the results closest to the expected values. However, even within 
a particular fusion task it is possible to find situations 
where another rule returns results substantially better than the 
previously selected one. That means two things: 


— there is no universal rule of conditioning, correct under all 
conditions, 

— if that is so, the particular fusion task should be split into 
at least two subtasks. 


In other words, the particular rule of conditioning should 
be selected dynamically according to specified circumstances 
of the information integration process. 

In this section, the authors would like to define the factors 
which may influence the choice of a particular rule 
of conditioning. 

The quality of the gathered information could be regarded as a basic 
parameter that affects the selection of conditioning rules. Further, 
this parameter may be decomposed into two components 
referring to the attribute (observation) model and the data. Thus, the quality 
aggregates both model adequacy and data precision. The 
fundamental question is how model adequacy and data 
precision may be assessed and transformed into the quality 
in order to choose a conditioning rule. 

A possible solution to this problem may reside in the analysis 
of the bba subjected to conditioning. A bba, by definition, constitutes 
a kind of distribution, where the individual masses reflect the 
degree of belief in particular hypotheses. If the sensors are not 
reliable, a relatively high mass will be transferred to the hypothesis 
describing complete ignorance. For instance, for the 
considered case it could be $I = F \cup H \cup U \cup N$. By implication, 
if the sensors are reliable, the mass referring to complete 
ignorance is zero. That may be regarded as the first insight 
into data precision. Another inference on data precision may 
be made by examining the distribution of mass over the rest 
of the hypotheses. Conciseness of the distribution means 
higher precision. Adequacy of the attribute (observation) 
model, on the other hand, may be defined by the compliance of 
the hypothesis of the highest mass with the hypothesis of the 
condition. If there exists any relation between the highest mass 
hypothesis and the condition, e.g. inclusion or intersection, 
they may be regarded as compliant. On the other hand, if they 
are disjoint they are regarded as noncompliant. 

Relating the deductions above to the features of the 
presented rules, a simple logic (briefly described in Table I) may 
be applied in order to choose the proper conditioning rule. 

Table I 
CHOICE OF THE CONDITIONING RULE BASED ON MODEL ADEQUACY AND 
DATA RELIABILITY 

Model   Data   Quality   Description              Conditioning 
poor    poor   poor      m_max ≁ Cond, m(Θ) ↑     absolute, (4) 
poor    good   poor      m_max ≁ Cond, m(Θ) ↓     absolute, (4) 
good    poor   poor      m_max ∼ Cond, m(Θ) ↑     absolute, (7) 
good    good   good      m_max ∼ Cond, m(Θ) ↓     relative, (7) 

(∼ / ≁: the highest mass hypothesis is compliant / not compliant with 
the condition; ↑ / ↓: the total ignorance mass is relatively high / low.) 

If the highest mass hypothesis is not compliant with the 
condition, which means the attribute (observation) model is not 
adequate, then no matter whether the data are precise or not, 
absolute conditioning should be applied with no respect 
to the attribute (observation) model. This may be achieved 
by using the rule (4) with $w_0 = 0$ and $w_A = 1$. 

If the mass referring to total ignorance is relatively high and 
the highest mass hypothesis is compliant with the condition, 
that means that the sensor data are poor and the attribute 
(observation) model is adequate. In such case absolute 
conditioning should be applied with respect to the attribute 
(observation) model, which may be achieved by using the rule 
(7) with $w_0 = 0$ and $w_A = 1$. 

Finally, if the mass referring to total ignorance is relatively 
low and the highest mass hypothesis is compliant with the 
condition, that means that the sensor data are reliable (good) 
and the attribute (observation) model is adequate. In such 
case relative conditioning should be applied with respect 
to the attribute (observation) model, which may be achieved by 
using the rule (7) with $w_0, w_A \in (0, 1)$, where $w_0 + w_A = 1$. 
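The selection logic of Table I can be written down as a small function. This is a sketch only: the function name `choose_conditioning` and the threshold `high_ignorance = 0.3` that separates a "relatively high" from a "relatively low" ignorance mass are assumptions, since the text gives no numeric boundary, and the judgment of model adequacy (compliance of the highest mass hypothesis with the condition) is passed in as a flag rather than computed.

```python
def choose_conditioning(model_adequate, ignorance_mass, high_ignorance=0.3):
    """Return (version, rule number) following the logic of Table I."""
    data_good = ignorance_mass < high_ignorance   # assumed threshold
    if not model_adequate:
        return ("absolute", 4)    # model not adequate: data quality does not matter
    if not data_good:
        return ("absolute", 7)    # adequate model, poor data
    return ("relative", 7)        # adequate model, good data


print(choose_conditioning(True, 0.0))    # adequate model, no ignorance mass
print(choose_conditioning(False, 0.4))   # adequacy not proven, high ignorance mass
print(choose_conditioning(True, 0.4))    # adequate model, high ignorance mass
```

In a running system the weights for the relative case would still have to be supplied separately, since they depend on the configuration of the particular fusion system.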

As a summary of this section, it is worth noting that the 
particular values of the ’relativity’ weights ($w_0$ and $w_A$) 
depend only on the specific configuration of the fusion system. 
In the authors’ opinion it is pointless to discuss any specific 
values without reference to the particular system, since there 
are no general guidelines for presetting them. 


V. SELECTION OF CONDITIONING RULES - EXAMPLES 


In order to illustrate the selection mechanism, a few more 
examples are provided. However, in the first place, it is 
suggested to reconsider the example from Section III. Table II 
presents the summarized bba (before and after conditioning). 
In this very case, before conditioning was performed, the dominant 
masses referred to the FRIEND and FAKER hypotheses, 
which was compliant with the condition hypothesis (FRIEND). 
That means the model was adequate. Additionally, the total 
ignorance mass has not been defined (as nonzero), which means 
the data were reliable. According to Table I, in such case 
the relative version of the rule (7) should be selected, which 
was exactly what was decided. 

Table II 
EXAMPLE 1: BBA BEFORE AND AFTER CONDITIONING OPERATION 

Threat \ bba     m      m_RC(.|F) 
F                0.2    0.223 
H                0.1    0 
U                0.1    0 
A=F∩U            0.1    0.212 
S=H∩U            0.1    0 
K=F∩H            0.1    0.212 
J=F∩H∩U          0.2    0.323 
N                0.1    0.03 
I=F∪H∪U∪N        0      0 

In the next example it is suggested to consider the bba given 
in the second column of Table III. In this case the biggest 
mass has been assigned to total ignorance. Furthermore, there 
is no predominance of any particular primary hypothesis [17] 
(FRIEND, HOSTILE, UNKNOWN, NEUTRAL) or secondary 
hypothesis [17] (ASSUMED FRIEND, SUSPECT, FAKER, 
JOKER). That means that the gathered data are not reliable 
and the model adequacy has not been proven. Therefore 
the absolute version of the rule (4) should be chosen. 

Table III 
EXAMPLE 2: BBA BEFORE AND AFTER CONDITIONING OPERATION 

Threat \ bba     m      m_AC(.|F) 
F                0.1    0.56 
H                0.1    0 
U                0.1    0 
A=F∩U            0.06   0.16 
S=H∩U            0.06   0 
K=F∩H            0.06   0.16 
J=F∩H∩U          0.06   0.12 
N                0.06   0 
I=F∪H∪U∪N        0.4    0 

In the last example it is suggested to consider the bba given 
in the second column of Table IV. In this case the biggest 
mass has also been assigned to total ignorance, which indicates 
relatively low sensor reliability. However, except for m(I), there 
is a predominance of the FRIEND hypothesis over the other 
hypotheses. Thus the model may be regarded as adequate. Therefore 
the absolute version of the rule (7) should be chosen. 

Table IV 
EXAMPLE 3: BBA BEFORE AND AFTER CONDITIONING OPERATION 

Threat \ bba     m      m_AC(.|F) 
F                0.16   0.422 
H                0.1    0 
U                0.1    0 
A=F∩U            0      0.1 
S=H∩U            0.06   0 
K=F∩H            0.06   0.259 
J=F∩H∩U          0.06   0.219 
N                0.06   0 
I=F∪H∪U∪N        0.4    0 

In the presented procedure of selection of the conditioning 
rules, the bba provides qualitative information on data reliability 
as well as model adequacy. Analyzing the above examples, 
a critical reader could regard reasoning about the adequacy 
of the model based on the bba as vague, because bbas 
are affected by measurement errors, and these errors may 
influence the decision whether a particular model 
is adequate or not. However, it is important to notice that 
in real systems these bbas are updated regularly, which makes it 
possible to improve statistically the reference for decision making. 
That means that any predominance of a certain hypothesis 
may be confirmed by the subsequent version of the updated bba. 

It is also a matter of convention how to deal with the particular 
case when $m(\Theta) = m(I)$ is the maximal mass in the bba. 
Assume that the condition hypothesis does not refer to total 
ignorance. On the one hand, since the bba influences both data 
reliability and model adequacy, it is justified to select the absolute 
version of the rule (4). On the other hand, it is reasonable 
to exclude the total ignorance hypothesis $m(\Theta) = m(I)$ 
while deciding about the adequacy of the model, in order to 
distinguish two aspects (qualitative features) of the gathered 
bba, which is the approach preferred by the authors. 


VI. CONCLUSION 


The new rules of conditioning introduced here have been
devised in response to problems that emerge when applying
the existing absolute conditioning techniques in the real world.
Considering the condition as identical with the ground truth
may be useful in theory; in practice, however, it is often
an assumption that is hard to accept [11]. Updating attribute
fusion results with evidence from an external system is an
excellent example of that. Each time highly processed information
is used, no matter how good the system is, there is a risk that
the output information is corrupted or at least slightly changed
[18], [19].

The presented conditioning rules make it possible to set weights
in order to define the degree of belief in the external system
output. These weights should be treated as tactical and technical
parameters of the system performing combination and
conditioning. Depending on the actual needs, they may be fixed
or changed dynamically. However, the exact values should result
from the particular system configuration, thus no theoretical
preference is made.

The choice of a particular conditioning rule is different,
and some general guidelines may be established.
The proposed method of selecting the conditioning rules
may be applied in Command and Control systems, where
multiple rules may be implemented. In such a case the choice
of the proper conditioning rule may constitute an element
of so-called Conditioning Management.


Abstract: Identification or authentication from the handwritten signature is the most
accepted biometric modality for identifying a person. However, a single handwritten signature
verification (HSV) system does not always achieve the required performance. Therefore,
rather than trying to optimize a single HSV system by choosing the best features or classifier for
a given system, researchers have found it more interesting to combine different systems. In that
case, the DSmT is reported as a very useful and powerful theoretical tool for enhancing the
performance of multimodal biometric systems. Hence, we propose in this chapter a study of
applying the DSmT for combining different HSV systems. Two cases are addressed for
validating the effective use of the DSmT. The first one is to enhance the performance of off-line
HSV systems by associating features based on Radon and Ridgelet transforms for each
individual system. The second one is associating off-line image and dynamic information in
order to improve the performance of single-source biometric systems and ensure greater
security. Experimental results obtained on standard datasets show the effective use of the
proposed DSmT based combination for improving the verification accuracy compared to
individual systems.


1.1 Introduction 


Biometrics is one of the most widely used approaches for identification and authentication 
of persons [1]. Hence, several biometric modalities have been proposed in the last decades, 
which are based on physiological and behavioral characteristics depending on their nature. 
Physiological characteristics are related to anatomical properties of a person, and include for 
instance fingerprint, face, iris and hand geometry. Behavioral characteristics refer to how a 
person performs an action, and include typically voice, signature and gait [1, 2]. Therefore, 
the choice of a biometric modality depends on several factors such as nonuniversality, 
nonpermanence, intraclass variations, poor image quality, noisy data, and matcher limitations 
[1, 3]. Thus, recognition based on unimodal biometric systems is not always reliable. To 
address these limitations, various works have been proposed for combining two or more 
biometric modalities in order to enhance the recognition performance [3, 4, 5]. This 
combination can be performed at data, feature, match score, and decision levels [3, 4]. 
However, given the constraints that arise from the joint use of classifiers and feature
generation methods, an appropriate operating method based on mathematical approaches is
needed, which takes into account two notions: uncertainty and imprecision of the classifier
responses. In general, most theoretical advances devoted to the theory of probabilities are
able to represent uncertain knowledge but are unable to model easily information that is
imprecise, incomplete, or not totally reliable. Moreover, they often lead to confusing the
concepts of uncertainty and imprecision with the probability measure. Therefore, new theories
dealing with uncertain and imprecise information have been introduced, such as fuzzy set
theory [6], evidence theory [7], possibility theory [8] and, more recently, the theory of
plausible and paradoxical reasoning, known as Dezert-Smarandache theory (DSmT)
[9, 10, 11]. The DSmT has been elaborated by Jean Dezert and Florentin Smarandache for
dealing with imprecise, uncertain and paradoxical sources of information. Thus, the main
objective of the DSmT is to provide combination rules that allow evidence issued from
different information sources to be combined correctly, even in the presence of conflicts
between sources or of constraints corresponding to an appropriate model (i.e. free or hybrid
DSm models [9]).


The use of the DSmT has been justified in many kinds of applications [9, 10, 11].
Indeed, the DSmT is reported as a very useful and powerful theoretical tool for enhancing the
performance of multimodal biometric systems. Hence, combination algorithms based on
DSmT have been used by Singh et al. [12] for robust face recognition through integrating
multilevel image fusion and match score fusion of visible and infrared face images. Vatsa et al.
proposed a DSmT based fusion algorithm [13] to efficiently combine level-2 and level-3
fingerprint features by incorporating image quality. Vatsa et al. also proposed a unification of
evidence-theoretic fusion algorithms [14] applied to fingerprint verification using level-2 and
level-3 features. A DSmT based dynamic reconciliation scheme for fusion rule selection [15]
has been proposed in order to manage the diversity of scenarios encountered in the probe
dataset.


Generally, the handwritten signature is considered the best-known modality for
biometric applications. Indeed, it is socially accepted for many
government/legal/financial transactions such as validation of checks, historical documents, etc.
[16]. Hence, intense research has been devoted to developing various robust verification
systems [17] according to the acquisition mode of the signature. Two modes are used for
capturing the signature: the off-line mode and the on-line mode. The off-line
mode generates a static handwriting image by scanning the document. In contrast,
the on-line mode generates dynamic information such as velocity and pressure from
pen tablets or digitizers. For both modes, many Handwritten Signature Verification (HSV)
systems have been developed in the past decades [17, 18, 19]. Generally, off-line HSV
systems remain less robust than on-line HSV systems [16] because of the
absence of dynamic information about the signer.


Generally, an HSV system is composed of three modules, which are preprocessing, feature
generation and classification. In this context, various methods have been developed for
improving the robustness of each individual HSV system. However, research on handwritten
signature verification has failed to establish the incontestable superiority of one method over
another in both the feature generation and classification steps. Hence, rather than trying to
optimize a single HSV system by choosing the best features for a given problem, researchers
have found it more interesting to combine several classifiers [20].


Recently, approaches for combining classifiers have been proposed to improve signature
verification performance, which led to the development of several schemes in order to treat
data in different ways [21]. Generally, three approaches for combining classifiers can be
considered: parallel approach [22, 23], sequential approach [24, 25] and hybrid approach [26],
[27]. However, the parallel approach is considered simpler and more suitable since it allows
exploiting the redundant and complementary nature of the responses issued from different
signature verification systems. Hence, sets of classifiers have been used, which are based on
global and local approaches [28, 29] and feature sets [30, 31], parameter features and function
features [32, 33], static and dynamic features [34, 35]. Furthermore, several decision
combination schemes have been implemented, ranging from majority voting [23, 36] to Borda
count [37], from simple and weighted averaging [38] to Dempster-Shafer evidence theory [37,
39] and Neural Networks [40, 41]. The boosting algorithm has been used to train and integrate
different classifiers, for both verification of on-line [42, 43] and off-line [44] signatures.


In this research, we follow the path of combined biometric systems by investigating the
DSmT for combining different HSV systems. Therefore, we study the reliability of the DSmT
for achieving a robust multiple HSV system. Two cases are considered for validating the
effective use of the DSmT. The first one is to enhance the performance of off-line HSV 
systems by associating features based on Radon and Ridgelet transforms for each individual 
system. The second one is associating off-line image and dynamic information in order to 
improve the performance of single-source biometric systems and ensure greater security. For 
both cases, the combination is performed through the generalized biometric decision 
combination framework using Dezert-Smarandache theory (DSmT) [9, 10, 11]. 


The chapter is organized as follows. Section 1.2 gives a review of the Proportional
Conflict Redistribution (PCR5) rule based on DSmT. Section 1.3 describes the
proposed verification system and Section 1.4 presents the performance criteria and datasets of
handwritten signatures used for evaluation. Section 1.5 discusses the experimental results of
the proposed verification system. The last section gives a summary of the proposed
verification system and looks at future research directions.


1.2 Review of PCR5 combination rule 


Generally, signature verification is formulated as a two-class problem where the classes are
associated with genuine and impostor, namely $\theta_{gen}$ and $\theta_{imp}$, respectively.
In the context of the probabilistic theory, the frame of discernment, namely $\Theta$, is
composed of two elements as $\Theta = \{\theta_{gen}, \theta_{imp}\}$, and a mapping function
$m(.) \in [0, 1]$ is associated with each class, which defines the corresponding mass verifying
$m(\emptyset) = 0$ and $m(\theta_{gen}) + m(\theta_{imp}) = 1$.


When combining two sources of information, and so two individual systems, namely
information sources $S^1$ and $S^2$, respectively, the sum rule seems effective for
non-conflicting responses [3]. In the opposite case, an alternative approach has been developed
by Dezert and Smarandache to deal with (highly) conflicting, imprecise and uncertain sources
of information [9, 10, 11]. For a two-class problem, a reference domain, also called the frame
of discernment, should be defined for performing the combination, which is composed of a
finite set of exhaustive and mutually exclusive hypotheses. An example of such an approach is
the PCR5 rule.


The main concept of the DSmT is to distribute the unitary mass of certainty over all the
composite propositions built from elements of $\Theta$ with the $\cup$ (union) and $\cap$
(intersection) operators instead of making this distribution over the elementary hypotheses
only. Therefore, the hyper-powerset $D^\Theta$ is defined as
$D^\Theta = \{\emptyset, \theta_{gen}, \theta_{imp}, \theta_{gen} \cup \theta_{imp}, \theta_{gen} \cap \theta_{imp}\}$.
The DSmT uses the generalized basic belief mass, also known as the generalized basic belief
assignment (gbba), computed on the hyper-powerset of $\Theta$ and defined by a map
$m(.): D^\Theta \to [0, 1]$ associated with a given source of evidence, which can support
paradoxical information, as follows: $m(\emptyset) = 0$ and
$m(\theta_{gen}) + m(\theta_{imp}) + m(\theta_{gen} \cup \theta_{imp}) + m(\theta_{gen} \cap \theta_{imp}) = 1$.
The combined mass $m_{PCR5}$ obtained from $m_1(.)$ and $m_2(.)$ by means of the PCR5 rule
[10] is defined as:


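$$m_{PCR5}(A) = \begin{cases} 0 & \text{if } A \in \Phi \\ m_{DSmC}(A) + m_{A \cap X}(A) & \text{otherwise} \end{cases} \qquad (1.1)$$

where

$$m_{A \cap X}(A) = \sum_{\substack{X \in D^\Theta \setminus \{A\} \\ c(A \cap X) = \emptyset}} \left[ \frac{m_1(A)^2 \, m_2(X)}{m_1(A) + m_2(X)} + \frac{m_2(A)^2 \, m_1(X)}{m_2(A) + m_1(X)} \right]$$

and $\Phi = \{\Phi_M, \emptyset\}$ is the set of all relatively and absolutely empty elements,
$\Phi_M$ is the set of all elements of $D^\Theta$ which have been forced to be empty in
Shafer's model $M$ defined by the exhaustive and exclusive constraints, $\emptyset$ is the
empty set, and $c(A \cap X)$ is the canonical form (conjunctive normal form) of $A \cap X$.
All denominators are assumed to be different from zero; if a denominator is zero, that fraction
is discarded. The term $m_{DSmC}(A)$ represents a conjunctive consensus, also called the
DSm Classic (DSmC) combination rule [9], which is defined as:

$$m_{DSmC}(A) = \begin{cases} 0 & \text{if } A = \emptyset \\ \sum_{X, Y \in D^\Theta,\; X \cap Y = A} m_1(X) \, m_2(Y) & \text{otherwise} \end{cases} \qquad (1.2)$$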
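
As a rough illustration of the rule above, the following Python sketch combines two mass functions on the two-class frame under Shafer's model, where $\theta_{gen} \cap \theta_{imp}$ is forced to be empty. The names `GEN`, `IMP`, `THETA` and `pcr5` are hypothetical and chosen only for this example, and the input masses are made-up values rather than results from the chapter. Each conflicting product is split between the two elements that caused it, in proportion to their masses.

```python
from itertools import product

# Focal elements under Shafer's model (assumption: theta_gen and theta_imp
# are exclusive, so their intersection is the empty set).
GEN = frozenset({"gen"})
IMP = frozenset({"imp"})
THETA = GEN | IMP  # theta_gen U theta_imp (total ignorance)


def pcr5(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with PCR5."""
    combined = {}
    for (x, mx), (y, my) in product(m1.items(), m2.items()):
        if mx == 0.0 or my == 0.0:
            continue  # zero products contribute nothing (avoids 0/0)
        inter = x & y
        if inter:
            # Conjunctive consensus term (DSmC part of the rule).
            combined[inter] = combined.get(inter, 0.0) + mx * my
        else:
            # Conflict: give the product mx*my back to x and y
            # proportionally to mx and my (redistribution term).
            combined[x] = combined.get(x, 0.0) + mx ** 2 * my / (mx + my)
            combined[y] = combined.get(y, 0.0) + my ** 2 * mx / (mx + my)
    return combined


if __name__ == "__main__":
    # Illustrative (made-up) masses for two conflicting sources.
    m1 = {GEN: 0.6, IMP: 0.3, THETA: 0.1}
    m2 = {GEN: 0.2, IMP: 0.7, THETA: 0.1}
    for element, mass in pcr5(m1, m2).items():
        print(sorted(element), round(mass, 4))
```

Because each conflicting product is returned in full to the two elements involved, the combined masses still sum to one without any global normalization.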


1.3 System description 


The combined individual HSV systems are depicted in Figure 1.1. They are composed of an
off-line verification system, an on-line or off-line verification system and a combination
module. $s_1$ and $s_2$ denote the off-line and on-line or off-line handwritten signatures
provided by two sources of information $S^1$ and $S^2$, respectively. Each individual
verification system is generally composed of three modules: pre-processing, feature generation
and classification.


[Figure 1.1 block diagram. Signature s is acquired twice: off-line acquisition ($s_1$) and
on-line or off-line acquisition ($s_2$). Off-line verification system: pre-processing, feature
generation, SVM classifier. On-line or off-line verification system: pre-processing, feature
generation, classifier. Both outputs feed the combination module, which returns accepted or
rejected.]


Figure 1.1: Structure of the combined individual HSV systems.




1.3.1 Pre-processing 


According to the acquisition mode, each handwritten signature is pre-processed to
facilitate the feature generation. The pre-processing of the off-line signature includes
two steps: binarization using the local iterative method [45] and elimination of the useless
information around the signature image without unifying its size. The pre-processing steps of
a signature example are shown in Figure 1.2. The binarization method was chosen to separate
the signature from the background. It takes advantage of locally adaptive binarization
methods [45] and adapts them to produce an algorithm that thresholds signatures in a more
controlled manner. By doing this, the local iterative method limits the amount of noise
generated and attempts to reconstruct sections of the signature that are disjointed.


Figure 1.2: Preprocessing steps: (a) Scanning (b) Binarization
(c) Elimination of the useless information.


For the on-line signature, no specific pre-processing is required. More details on the
acquisition method and pre-processing module of the on-line signatures are provided in Refs.
[46] and [47].


1.3.2 Feature generation 


Features are generated according to the acquisition mode. In the combined individual HSV
systems, we use the uniform grid, Radon and Ridgelet transforms for off-line signatures and
dynamic characteristics for on-line signatures.


a. Features used for combining individual off-line HSV systems 


The first case study for evaluating the performance of the proposed combination using 
DSmT is performed with two individual off-line HSV systems. Features are generated from 
the same off-line signature using the Radon and Ridgelet transforms. The Radon transform is 
well adapted for detecting linear features. In contrast, the Ridgelet transform allows 
representing linear singularities [48]. Therefore, Radon and Ridgelet coefficients provide 
complementary information about the signature. 


• Radon transform based features: The Radon transform of each off-line signature is
calculated by setting the respective number of projection points $N_r$ and orientations
$N_\theta$, which define the length of the radial and angular vectors, respectively. Hence,
a Radon matrix of size $[N_r \times N_\theta]$ is obtained, which provides at each point the
cumulative intensity of pixels forming the image of the off-line signature. Figure 1.3 shows
an example of a binarized image of an off-line signature and the steps involved for
generating features based on the Radon transform. Since the Radon transform is redundant,
we take into account only the positive radial points $[N_r/2 \times N_\theta]$. Then, for
each angular direction, the energy of the Radon coefficients is computed to form the feature
vector $x_1$ of dimension $[1 \times N_\theta]$. This energy is defined as:


$$E_\theta^{rad} = \sum_{r=1}^{N_r/2} T_{rad}^2(r, \theta), \quad \theta \in \{1, 2, \ldots, N_\theta\} \qquad (1.3)$$


where $T_{rad}$ is the Radon transform operator.





[Figure 1.3 panels: binarized image; Radon image; Radon image without redundancy; feature
vector. Horizontal axis of the Radon images: angular axis (0 to 360).]


Figure 1.3: Steps for generating the feature vector from the Radon transform.


• Ridgelet transform based features: To generate information complementary to the
Radon features, the wavelet transform (WT) is performed along the radial axis, which
yields the Ridgelet coefficients [49]. Figure 1.4 shows an example for generating
the feature vector from the Ridgelet transform. For each angular direction, the energy
of the Ridgelet coefficients is computed taking into account only the details issued from
the decomposition level $L$ of the WT. The different values of energy are finally
stored in a vector $x_2$ of dimension $[1 \times N_\theta]$. This energy is defined as:


$$E_\theta^{rid} = \sum_{b=1}^{N_r/2} T_{rid}^2(a, b, \theta), \quad \theta \in \{1, 2, \ldots, N_\theta\} \qquad (1.4)$$


where $T_{rid}$ is the Ridgelet transform operator whereas $a$ and $b$ are the scaling and
translation factors of the WT, respectively.


Figure 1.4: Steps for generating the feature vector from the Ridgelet transform.
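
The two energy features described above can be sketched in a few lines of Python. This is only an illustrative sketch under several assumptions that are not stated in the chapter: the Radon matrix is taken as already computed and stored as a list of $N_r$ rows with $N_\theta$ values each, the first half of the rows is treated as the positive radial points, and the wavelet step uses an orthonormal Haar filter keeping only the detail coefficients of level $L$. The function names are hypothetical.

```python
import math


def radon_energy(radon):
    """Energy of the Radon coefficients for each angular direction.

    radon: list of N_r rows (radial axis), each holding N_theta values.
    Assumption: the first N_r / 2 rows are the positive radial points.
    """
    half = radon[: len(radon) // 2]
    n_theta = len(radon[0])
    return [sum(row[t] ** 2 for row in half) for t in range(n_theta)]


def haar_details(signal, level):
    """Detail coefficients of the given Haar decomposition level."""
    approx = list(signal)
    details = []
    for _ in range(level):
        pairs = [(approx[i], approx[i + 1])
                 for i in range(0, len(approx) - 1, 2)]
        details = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


def ridgelet_energy(radon, level):
    """Energy of level-`level` Haar details taken along the radial axis."""
    half = radon[: len(radon) // 2]
    n_theta = len(radon[0])
    energies = []
    for t in range(n_theta):
        column = [row[t] for row in half]
        energies.append(sum(d ** 2 for d in haar_details(column, level)))
    return energies


if __name__ == "__main__":
    # Tiny made-up Radon matrix: 8 radial points, 2 orientations.
    radon = [[1, 1], [2, 1], [3, 1], [4, 1], [0, 0], [0, 0], [0, 0], [0, 0]]
    print(radon_energy(radon))
    print(ridgelet_energy(radon, level=1))
```

In this toy matrix the second orientation is constant along the radial axis, so its Haar details vanish while its Radon energy does not, which illustrates why the two feature vectors can carry complementary information.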


b. Features used for combining individual off-line and on-line HSV systems 


The second case study evaluates the performance of the proposed DSmT based
combination of individual off-line and on-line HSV systems. Features are
generated from both off-line and on-line signatures of the same user using the uniform grid
(UG) and dynamic characteristics, respectively. The UG allows features to be extracted
locally without normalization of the off-line signature image. In each grid region, the
densities are computed, providing overall signature appearance information. In contrast,
dynamic characteristics computed from the on-line signature provide complementary dynamic
information in the combination process.


• Uniform grid based features: Features are generated using the Uniform Grid (UG)
[50, 51], which consists of creating $n \times m$ rectangular regions for sampling. Each
region has the same size and shape. Parameters $n$ and $m$ define the number of lines
(vertical regions) and columns (horizontal regions) of the grid, respectively. The feature
associated with each region is defined as the ratio of the number of pixels belonging to the
signature to the total number of pixels of the image. The different values are
finally stored in a vector $x_1$ of dimension $n \times m$, which characterizes the off-line
signature image.

Figure 1.5 shows a 3 x 5 grid, which allows an important reduction of the representation
vector but preserves the visual information poorly. In contrast, a 15 x 30 grid
provides an accurate representation of images but leads to a larger characteristic vector.
A 5 x 9 grid seems to be a good compromise between quality of representation and
dimensionality. The choice of the grid size for all writers is thus clearly important for
solving our signature verification problem effectively. In our case, for all
experiments, the parameters $n$ and $m$ are fixed to 5 and 9, respectively.





[3x5] [5x9] [15x30] 


Figure 1.5: Visualization of different grid sizes. 


• Dynamic information based features: For the individual on-line verification system,
features are generated using only the dynamic features. Each on-line signature is
represented by a vector $x_2$ composed of 11 features, which are signature total duration,
average velocity, vertical average velocity, horizontal average velocity, maximal
velocity, average acceleration, maximal acceleration, variance of pressure, mean of
azimuth angle, variance of azimuth angle and mean of elevation angle. A complete
description of the feature set is shown in Table 1.1.
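
The uniform grid features described above can be illustrated with a short Python sketch. It assumes a binarized image stored as a list of rows of 0/1 values, with 1 marking a signature pixel; the function name `uniform_grid_features` and the way region boundaries are rounded are choices made for this example only. Following the text, each region contributes the number of its signature pixels divided by the total number of pixels of the image.

```python
def uniform_grid_features(image, n=5, m=9):
    """Return the n*m density features of a binary signature image.

    image: list of rows, each a list of 0/1 pixels (1 = signature pixel).
    n, m:  number of grid lines and columns (5 and 9 in the experiments).
    """
    height, width = len(image), len(image[0])
    total_pixels = height * width
    features = []
    for i in range(n):
        # Integer boundaries so that the regions tile the whole image.
        r0, r1 = i * height // n, (i + 1) * height // n
        for j in range(m):
            c0, c1 = j * width // m, (j + 1) * width // m
            ink = sum(image[r][c]
                      for r in range(r0, r1)
                      for c in range(c0, c1))
            features.append(ink / total_pixels)
    return features


if __name__ == "__main__":
    # Made-up 10 x 18 image with a horizontal stroke on rows 4 and 5.
    image = [[1 if r in (4, 5) else 0 for _ in range(18)] for r in range(10)]
    x1 = uniform_grid_features(image)
    print(len(x1), max(x1))
```

With the default 5 x 9 grid the feature vector always has 45 entries, regardless of the image size, which is what allows the off-line image to be used without size normalization.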


1.3.3 Classification based on SVM 
a. Review of SVMs 


The classification based on Support Vector Machines (SVMs) has been widely used in
many pattern recognition applications such as handwritten signature verification [35, 52]. The
SVM is a learning method introduced by Vapnik et al. [53], which tries to find an optimal
hyperplane for separating two classes. Its concept is based on maximizing the
distance between the closest points belonging to each of the two classes. In this way, the
misclassification error of data both in the training set and test set is minimized.


Table 1.1: Set of dynamic features. $s = \{Pt_1, Pt_2, \ldots, Pt_n\}$ denotes an on-line
signature composed of $n$ events $Pt_i(x_i, y_i, t_i)$; $x_i$, $y_i$, $Pr_i$, $Az_i$, $Al_i$
denote the x-position, y-position, pen pressure, azimuth and elevation angles of the pen at
the $i$-th time instant $t_i$, respectively.


Basically, SVMs have been defined for linearly separating two classes. When data are not
linearly separable, a kernel function is used. All mathematical functions which satisfy
Mercer's conditions are eligible to be an SVM kernel [53]. Examples of such kernels are the
sigmoid kernel, the polynomial kernel, and the Radial Basis Function (RBF) kernel. Generally,
the RBF kernel is used for its better performance, which is defined as:


$$K(x, x_k) = \exp\left(-\frac{\|x - x_k\|^2}{2\sigma^2}\right) \qquad (1.5)$$


where $\sigma$ is the kernel parameter and $\|x - x_k\|$ is the Euclidean distance between
two samples. The decision function $f: \mathbb{R}^p \to \{-1, +1\}$ is expressed in terms of
the kernel expansion as:


$$f(x) = \sum_{k=1}^{Sv} \alpha_k y_k K(x, x_k) + b \qquad (1.6)$$


where $\alpha_k$ are Lagrange multipliers, $Sv$ is the number of support vectors $x_k$, which
are training data such that $0 \le \alpha_k \le C$, $C$ is a user-defined parameter that
controls the tradeoff between the machine complexity and the number of nonseparable points
[54], and the bias $b$ is a scalar computed by using any support vector. Finally, test data
$x_d$, $d \in \{1, 2\}$, are classified according to:


$$x_d \in \begin{cases} \text{class } (+1) & \text{if } f(x_d) > 0 \\ \text{class } (-1) & \text{otherwise} \end{cases} \qquad (1.7)$$
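
To make the RBF kernel (1.5), the kernel expansion (1.6) and the classification rule (1.7) concrete, here is a minimal Python sketch. It assumes that the support vectors, multipliers, labels and bias have already been obtained by some training procedure, which is not shown; the function names and the toy values are hypothetical.

```python
import math


def rbf_kernel(x, xk, sigma):
    """RBF kernel of (1.5): exp(-||x - xk||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xk))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision_function(x, support_vectors, alphas, labels, bias, sigma):
    """Kernel expansion of (1.6): sum_k alpha_k y_k K(x, x_k) + b."""
    return sum(a * y * rbf_kernel(x, sv, sigma)
               for a, y, sv in zip(alphas, labels, support_vectors)) + bias


def classify(x, support_vectors, alphas, labels, bias, sigma):
    """Rule (1.7): class +1 if f(x) > 0, class -1 otherwise."""
    f = decision_function(x, support_vectors, alphas, labels, bias, sigma)
    return 1 if f > 0 else -1


if __name__ == "__main__":
    # Made-up model with one support vector per class.
    svs = [(0.0, 0.0), (2.0, 0.0)]
    alphas, labels, bias, sigma = [1.0, 1.0], [1, -1], 0.0, 1.0
    print(classify((0.1, 0.0), svs, alphas, labels, bias, sigma))
    print(classify((1.9, 0.0), svs, alphas, labels, bias, sigma))
```

The real-valued output of `decision_function`, rather than the final class label, is what the membership models of the next subsection take as input.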


b. Decision rule 


The direct use of SVMs does not allow defining a decision threshold to assign a signature
to the genuine or forgery class. Therefore, the outputs of the SVM are transformed into
objective evidences, which express the membership degree (MD) of a signature to both classes
(genuine or forgery). In practice, the MD has no standard form. The only constraint is that it
must lie in the range [0, 1], whereas SVMs produce a single output. In this chapter,
we use a fuzzy model which has been proposed in [50, 51, 55] to assign an MD to the SVM
output for both the genuine and impostor classes. Let $f(x_d)$ be the output of an SVM
obtained for a given signature to be classified. The respective membership degrees
$h_d(\theta_i)$, $i \in \{gen, imp\}$, associated with the genuine and impostor classes are
defined according to the membership models given in Algorithm 1 [51]. To compute the values
of the membership degrees ($h_d$, $d = 1, 2$), we consider the two case studies as follows:


• In the first case study, the main problem for generating features is choosing the
appropriate number of angular directions $N_\theta$ for the Radon transform and the
decomposition level $L$ of the WT (Haar wavelet) in the Ridgelet domain. Hence, many
experiments are conducted for finding the optimal values for which the error rate in the
training phase is null. In this case, feature vectors are generated from both the Radon
($d = 1$) and Ridgelet ($d = 2$) transforms of the same off-line signature by setting
$N_\theta$ and $L$ to 32 and 3, respectively.


• In the second case study, we calculate the values ($h_d$, $d = 1$) of the off-line
signature by using the optimal size [5 x 9] of the grid for which the error rate in the
training phase is null. In the same way, we also calculate the values ($h_d$, $d = 2$) of the
on-line signature by using the vector of 11 dynamic features for which the error rate in the
training phase is null.


Algorithm 1. Respective membership models for two classes.


if $f(x_d) > 1$ then
    $h_d(\theta_{gen}) = 1$
    $h_d(\theta_{imp}) = 0$
else
    if $f(x_d) < -1$ then
        $h_d(\theta_{gen}) = 0$
        $h_d(\theta_{imp}) = 1$
    else
        $h_d(\theta_{gen}) = \frac{1 + f(x_d)}{2}$
        $h_d(\theta_{imp}) = \frac{1 - f(x_d)}{2}$
    end if
end if


Hence, a decision rule is applied to determine whether the signature is genuine or a forgery,
as described in Algorithm 2.





Algorithm 2. Decision making in SVM framework.


if $h_d(\theta_{gen}) / h_d(\theta_{imp}) > t$ then
    $s_d \in \theta_{gen}$
else
    $s_d \in \theta_{imp}$
end if


where $t$ defines a decision threshold.
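
The two algorithms above can be sketched together in Python as follows. The middle branch of the membership model is garbled in this copy of the text, so the linear mapping $(1 \pm f(x_d))/2$ used here is an assumption, chosen because it is continuous with the two saturated branches. Treating a zero impostor membership as an automatic acceptance is also a choice made for this example, since the ratio is then unbounded. The function names are hypothetical.

```python
def membership_degrees(f_value):
    """Map an SVM output to (h_gen, h_imp), both in [0, 1]."""
    if f_value > 1.0:
        return 1.0, 0.0
    if f_value < -1.0:
        return 0.0, 1.0
    # Assumed linear model between the two saturated branches.
    return (1.0 + f_value) / 2.0, (1.0 - f_value) / 2.0


def svm_decision(f_value, t):
    """Accept as genuine when h_gen / h_imp exceeds the threshold t."""
    h_gen, h_imp = membership_degrees(f_value)
    if h_imp == 0.0:
        return "genuine"  # ratio is unbounded (assumption: accept)
    return "genuine" if h_gen / h_imp > t else "impostor"


if __name__ == "__main__":
    for f in (-1.5, -0.5, 0.0, 0.5, 1.5):
        print(f, membership_degrees(f), svm_decision(f, t=1.0))
```

Raising `t` makes the system stricter, trading false acceptances for false rejections, which is how an operating point can be selected.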


1.3.4 Classification based on DSmT


The proposed combination module consists of three steps: i) transform the membership
degrees of the SVM outputs into belief assignments using an estimation technique based on the
dissonant model of Appriou, ii) combine the masses through a DSmT based combination rule and
iii) make a decision for accepting or rejecting a signature.


a. Estimation of masses 


In this chapter, the mass functions are estimated using a dissonant model of Appriou, 
which is defined for two classes [56]. Therefore, the extended version of Appriou’s model in 
DSmT framework is given as: 





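$$m_{id}(\emptyset) = 0 \qquad (1.8)$$

$$m_{id}(\theta_i) = \frac{(1 - \beta_{id}) \, h_d(\theta_i)}{1 + h_d(\theta_i)} \qquad (1.9)$$

$$m_{id}(\bar{\theta}_i) = \frac{1 - \beta_{id}}{1 + h_d(\theta_i)} \qquad (1.10)$$

$$m_{id}(\theta_i \cup \bar{\theta}_i) = \beta_{id} \qquad (1.11)$$

$$m_{id}(\theta_i \cap \bar{\theta}_i) = 0 \qquad (1.12)$$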


where $i \in \{gen, imp\}$, $h_d(\theta_i)$ is the membership degree of a signature provided
by the corresponding source $S^d$ ($d = 1, 2$), $(1 - \beta_{id})$ is a confidence factor of
the $i$-th class, and $\beta_{id}$ defines the error provided by each source ($d = 1, 2$) for
each class $\theta_i$. In our approach, we consider $\beta_{id}$ as the verification accuracy
prior computed on the training database for each class [14]. Since both SVM models have been
validated on the basis that errors during the training phase are zero, $\beta_{id}$ is fixed
to 0.001 in the estimation model.

Note that the same information source cannot provide two responses simultaneously.
Hence, in the DSmT framework, we consider that the paradoxical hypothesis
$\theta_i \cap \bar{\theta}_i$ has no physical meaning with respect to $\theta_{gen}$ and
$\theta_{imp}$. Therefore, the beliefs assigned to this hypothesis are null, as given in
Equation (1.12).
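
A small Python sketch of the mass estimation step is given below. Equations (1.9) and (1.10) are garbled in this copy, so the fractions used here follow the commonly cited form of Appriou's dissonant model with a normalization factor of one; this is an assumption, not a confirmed transcription of the chapter. The function name `appriou_masses` is hypothetical. What can be checked against the text is that the ignorance mass equals $\beta_{id}$ and that the three masses sum to one.

```python
def appriou_masses(h, beta=0.001):
    """Estimate masses for one class from a membership degree h in [0, 1].

    Returns (m(theta_i), m(not theta_i), m(theta_i U not theta_i)).
    Assumed form of the dissonant model; beta is the error of the source.
    """
    support = (1.0 - beta) * h / (1.0 + h)   # mass in favour of theta_i
    against = (1.0 - beta) / (1.0 + h)       # mass in favour of its complement
    ignorance = beta                         # mass on the union, as in (1.11)
    return support, against, ignorance


if __name__ == "__main__":
    # Made-up membership degree for the genuine class.
    m_gen, m_not_gen, m_unknown = appriou_masses(0.75)
    print(m_gen, m_not_gen, m_unknown)
    print(m_gen + m_not_gen + m_unknown)
```

With $\beta_{id} = 0.001$ almost all of the mass is shared between the class and its complement, which reflects the high confidence placed in SVM models that made no training errors.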


b. Combination of masses 
The combined masses are computed in two steps. First, the belief assignments
($m_{id}(.)$, $i \in \{gen, imp\}$) are combined for generating the belief assignments for
each source as follows:


$$m_1 = m_{gen,1} \odot m_{imp,1} \qquad (1.13)$$

$$m_2 = m_{gen,2} \odot m_{imp,2} \qquad (1.14)$$

where $\odot$ represents the conjunctive consensus of the DSmC rule.
Finally, the belief assignments for the combined sources ($m_d(.)$, $d = 1, 2$) are computed
as:

$$m_c = m_1 \oplus m_2 \qquad (1.15)$$

where $\oplus$ represents the combination operator, which is composed of both the conjunctive
and redistribution terms of the PCR5 rule.


c. Decision rule


A decision for accepting or rejecting a signature is made using a statistical classification
technique. First, the combined beliefs are converted into a probability measure using a
probabilistic transformation, called the Dezert-Smarandache probability (DSmP), that maps a
belief measure to a subjective probability measure [11] defined as:


$$DSmP_\epsilon(\theta_i) = m_c(\theta_i) + (m_c(\theta_i) + \epsilon) \, w_M \qquad (1.16)$$


where $w_M$ is a weighting factor defined as:


$$w_M = \sum_{\substack{A_j \in 2^\Theta \\ A_j \supset \theta_i \\ C_M(A_j) \ge 2}} \frac{m_c(A_j)}{\sum_{\substack{A_k \in 2^\Theta \\ A_k \subset A_j \\ C_M(A_k) = 1}} m_c(A_k) + \epsilon \, C_M(A_j)}$$


such that $\epsilon \ge 0$ is a tuning parameter, $M$ is Shafer's model for $\Theta$, and
$C_M(A_k)$ denotes the DSm cardinal [11] of the set $A_k$. The likelihood ratio test is then
performed for decision making as described in Algorithm 3.





Algorithm 3. Decision making in DSmT framework.


if $DSmP_\epsilon(\theta_{gen}) / DSmP_\epsilon(\theta_{imp}) > t$ then
    $s \in \theta_{gen}$
else
    $s \in \theta_{imp}$
end if





where $t$ defines a decision threshold and $s = \{s_1, s_2\}$ is the $j$-th signature
represented by two modalities according to the case study as follows:


• In the first case study, $s$ is an off-line signature characterized by both Radon and
Ridgelet features.

• In the second case study, $s$ is a signature represented by both off-line and on-line
modalities.
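
For the two-class frame under Shafer's model, the only element of DSm cardinal two is $\theta_{gen} \cup \theta_{imp}$, so the DSmP weighting factor reduces to a single fraction. The Python sketch below works out this special case and applies the likelihood ratio test; it is an illustration under that assumption, with hypothetical function names and a default value of $\epsilon$ chosen only for the example.

```python
def dsmp_two_class(m_gen, m_imp, m_union, eps=0.001):
    """DSmP for a two-class frame with masses on gen, imp and their union.

    The mass of the union is shared between the two classes in
    proportion to (m_c(theta_i) + eps).
    """
    # The DSm cardinal of theta_gen U theta_imp is 2, hence the 2 * eps term.
    w = m_union / (m_gen + m_imp + 2.0 * eps)
    p_gen = m_gen + (m_gen + eps) * w
    p_imp = m_imp + (m_imp + eps) * w
    return p_gen, p_imp


def dsmt_decision(m_gen, m_imp, m_union, t, eps=0.001):
    """Likelihood ratio test on the DSmP values."""
    p_gen, p_imp = dsmp_two_class(m_gen, m_imp, m_union, eps)
    if p_imp == 0.0:
        return "genuine"  # ratio is unbounded (assumption: accept)
    return "genuine" if p_gen / p_imp > t else "impostor"


if __name__ == "__main__":
    # Made-up combined masses m_c.
    print(dsmp_two_class(0.4, 0.5, 0.1))
    print(dsmt_decision(0.4, 0.5, 0.1, t=1.0))
```

The two probabilities always sum to the total input mass, so the transformation only reallocates the ignorance mass and never creates or removes belief.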


1.4 Performance criteria and dataset description 


In this section, we briefly describe the datasets used and the performance criteria for
evaluating the proposed DSmT for combining individual handwritten signature verification
systems.


1.4.1 Dataset description 


To evaluate the verification performance of the proposed DSmT based combination of 
individual HSV systems, we use two datasets of handwritten signatures: (1) CEDAR signature 
dataset [57] used for evaluating the performance for combining individual off-line HSV 
systems and (2) NISDCC signature dataset [58] for the experiments related to the 
simultaneous verification of individual off-line and on-line HSV systems. 


a. CEDAR signature database


The Center of Excellence for Document Analysis and Recognition (CEDAR) signature
dataset [57] is commonly used for off-line signature verification. The CEDAR dataset
consists of 55 signature sets, each one produced by one writer. Each writer provided 24
samples of their signature, and these samples constitute the genuine portion of the dataset.
Forgeries are obtained by asking arbitrary people to skillfully forge the signatures of
the previously mentioned writers. In this fashion, 24 forgery samples are collected per writer
from about 20 skillful forgers. In total, this dataset contains 2640 signatures, built from 1320
genuine signatures and 1320 skilled forgeries. Figures 1.6(a) and 1.6(b) show two examples
of preprocessed genuine and forgery signatures for one writer, respectively.





(a) Genuine signatures. (b) Forgery signatures. 
Figure 1.6: Signature samples of the CEDAR. 


b. NISDCC signature database 


The Norwegian Information Security laboratory and Donders Centre for Cognition
(NISDCC) signature dataset has been used in the ICDAR'09 signature verification
competition [58]. This collection contains simultaneously acquired on-line and off-line
samples. The off-line dataset is called "NISDCC-offline" and contains only static information,
while the on-line dataset, which is called "NISDCC-online", also contains dynamic
information, which refers to the recorded temporal movement of the handwriting process. Thus,
the acquired on-line signature is available as a sequence of sampled trajectory
points. Each point is acquired at 200 Hz on a tablet and contains five recorded pen-tip
coordinates: x-position, y-position, pen pressure, azimuth and elevation angles of the pen [59].
The NISDCC-offline dataset is composed of 1920 images from 12 authentic writers (5
authentic signatures per writer) and 31 forging writers (5 forgeries per authentic signature).
Figures 1.7(a) and 1.7(b) show an example of a preprocessed off-line signature and the
plotted matching on-line signature for one writer, respectively.











Figure 1.7: Signature samples of the NISDCC signature collection. 


1.4.1 Performance criteria 


To evaluate the performance of the combined individual HSV systems, three kinds of
error are considered: the False Acceptance Rate (FAR), computed only on skilled
forgeries; the False Rejection Rate (FRR), computed only on genuine signatures; and the
Half Total Error Rate (HTER), which takes both rates into account as their mean,
HTER = (FAR + FRR)/2. The Equal Error Rate is thus the special case of HTER in which
FRR = FAR.
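As an illustration of these criteria, the following minimal sketch computes FAR, FRR and HTER for one decision threshold. The function name, the toy scores and the convention that a signature is accepted when its score is greater than or equal to the threshold are assumptions made for this example; they are not taken from the chapter.

```python
def far_frr_hter(genuine_scores, forgery_scores, threshold):
    """Return (FAR, FRR, HTER) in percent for one decision threshold.

    Assumption: a signature is accepted when its score is >= threshold.
    """
    # A false acceptance is a forgery whose score passes the threshold.
    false_accepts = sum(1 for s in forgery_scores if s >= threshold)
    # A false rejection is a genuine signature whose score fails it.
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = 100.0 * false_accepts / len(forgery_scores)
    frr = 100.0 * false_rejects / len(genuine_scores)
    return far, frr, (far + frr) / 2.0


# Hypothetical scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
forged = [0.1, 0.2, 0.6, 0.3]
far, frr, hter = far_frr_hter(genuine, forged, threshold=0.5)
```

Sweeping the threshold over a range of values and keeping the one with the smallest HTER gives an "optimal threshold" in the sense used in the result tables of this chapter.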
1.4.2 SVM model


For both case studies, signature data are split into training and testing sets for evaluating 
the performances of the proposed DSmT based combination of individual HSV systems. 
Thus, the training phase allows finding the optimal hyperparameters for each individual SVM 
model. In our system, the RBF kernel is selected for the experiments. 


a. SVM models used for combined individual off-line HSV systems 


In the first case study, an SVM model is produced for each individual off-line HSV system
according to the Radon and Ridgelet features, respectively. For each writer, 2/3 and 1/3 of
the samples are used for training and testing, respectively. The optimal parameters (C, σ)
of each SVM are tuned experimentally and fixed as (C = 19.1, σ = 4) and
(C = 15.1, σ = 4.6), respectively.


b. SVM models used for combined individual off-line and on-line HSV systems 


In the second case study, an SVM model is produced for each of the individual off-line and
on-line HSV systems according to the uniform grid features and dynamic information,
respectively. For each writer and both datasets, 2/3 and 1/3 of the samples are used for
training and testing, respectively. The optimal parameters (C, σ) for the two SVM
classifiers (off-line and on-line) are tuned experimentally and fixed as
(C = 9.1, σ = 9.4) and (C = 13.1, σ = 2.2), respectively.
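To make the role of the kernel width concrete, here is a minimal sketch of a Gaussian RBF kernel. The parameterization exp(-||x - y||^2 / (2σ^2)) is an assumption: the chapter does not state which form of the RBF kernel it uses. The helper `sigma_to_gamma` is hypothetical and only shows the conversion needed by libraries that write the kernel as exp(-γ ||x - y||^2).

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel, assuming K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sigma_to_gamma(sigma):
    """Equivalent gamma for the form exp(-gamma * ||x - y||^2)."""
    return 1.0 / (2.0 * sigma ** 2)


# Identical vectors give the maximal kernel value of 1.
k_same = rbf_kernel([0.0, 0.0], [0.0, 0.0], sigma=4.0)
# Kernel value for two points whose distance equals sigma.
k_far = rbf_kernel([0.0], [4.0], sigma=4.0)
```

Under this assumption a larger σ makes the kernel decay more slowly with distance, so the decision boundary becomes smoother.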


1.5 Experimental results and discussion 


For each case study, decision making is done only on the simple classes. Hence, we
consider the masses associated with all classes belonging to the hyper power set
$D^\Theta = \{\emptyset, \theta_{gen}, \theta_{imp}, \theta_{gen} \cup \theta_{imp}, \theta_{gen} \cap \theta_{imp}\}$
in both the combination process and decision making. In the context of signature
verification, we take as a constraint the proposition that
$\theta_{gen} \cap \theta_{imp} = \emptyset$ in order to separate the genuine and impostor
classes. Therefore, the hyper power set $D^\Theta$ is simplified to the power set
$2^\Theta = \{\emptyset, \theta_{gen}, \theta_{imp}, \theta_{gen} \cup \theta_{imp}\}$, which
defines Shafer's model [9]. This section presents the experimental results with their
discussion.


To evaluate the performance of the proposed DSmT based combination, the first case study
uses two individual off-line HSV systems on the CEDAR database. The task of the proposed
combination module is to manage the conflicts generated between the two individual
off-line HSV systems for each signature using the PCR5 combination rule. To this end, we
compute the verification errors of both individual off-line HSV systems and of their
combination with the PCR5 rule. Figure 1.8 shows the FRR and FAR computed for different
values of the decision threshold using both individual off-line HSV systems of this first
case study. Table 1.2 shows the verification error rates computed for the corresponding
optimal values of the decision threshold. Here HSV system 1 is the individual off-line
verification system fed by Radon features, which yields an error rate of 7.72% at the
optimal threshold t = 1.11, while HSV system 2 is the individual off-line verification
system fed by Ridgelet features, which provides the same result at the optimal threshold
t = 0.991. Consequently, both individual off-line HSV systems give the same verification
performance, with HTER = 7.72%.
The proposed DSmT based combination of individual off-line HSV systems yields an HTER
of 5.45% at the optimal threshold value t = 0.986. Hence, combining the individual
off-line HSV systems with the PCR5 rule reduces the HTER by 2.27 percentage points. This
is due to the efficient redistribution of the partial conflicting mass only to the
elements involved in the partial conflict.

















[Figure 1.8 plot panels: FRR and FAR curves for each individual system; x-axis: threshold value, y-axis: error rate (%).]


Figure 1.8: Performance evaluation of the individual off-line HSV systems. 




















HSV Systems        Optimal Threshold   FAR     FRR     HTER
System 1           1.110               7.72    7.72    7.72
System 2           0.991               7.72    7.72    7.72
Combined Systems   0.986               5.45    5.45    5.45

















Table 1.2: Error rates (%) obtained for individual and combined HSV systems. 


In the second case study, two sources of information are combined through the PCR5 rule.
Figure 1.9 shows three examples of the conflict measured between off-line and on-line
signatures for writers 3, 7, and 10 of the NISDCC dataset, respectively. The values
$K_{c_3} \in [0.00, 0.35]$, $K_{c_7} \in [0.00, 0.64]$, and $K_{c_{10}} \in [0.00, 1.00]$
represent the mass assigned to the empty set after combination. We can see that the two
sources of information are highly conflicting. Hence, the task of the proposed combination
module is to manage the conflicts generated by both sources ($K_{c_w}$, $w = 1, 2, \ldots, 12$)
for each signature using the PCR5 combination rule. To this end, we compute the
verification errors of both individual off-line and on-line HSV systems and of the proposed
DSmT based combination. Figure 1.10 shows the FRR and FAR computed for different values of
the decision threshold using both individual off-line and on-line HSV systems of this second
case study. For better comparison, Table 1.3 shows the HTER computed for the corresponding
optimal values of the decision threshold.
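The conflict between two sources can be illustrated with a short sketch that computes the mass sent to the empty set by the conjunctive combination of two BBA's under Shafer's model. The mass values and the names `m_offline` and `m_online` are hypothetical, chosen only to show the computation; they are not measurements from the NISDCC experiments.

```python
from itertools import product


def conflict_mass(m1, m2):
    """Mass assigned to the empty set by the conjunctive rule.

    Each BBA maps a focal element (a frozenset of hypotheses) to its mass.
    Two focal elements conflict when their intersection is empty.
    """
    return sum(
        a * b
        for (x, a), (y, b) in product(m1.items(), m2.items())
        if not (x & y)
    )


GEN = frozenset({"gen"})
IMP = frozenset({"imp"})
BOTH = GEN | IMP  # total ignorance: gen U imp

# Hypothetical BBA's for one signature, for illustration only.
m_offline = {GEN: 0.7, IMP: 0.2, BOTH: 0.1}
m_online = {GEN: 0.1, IMP: 0.8, BOTH: 0.1}
k = conflict_mass(m_offline, m_online)  # 0.7*0.8 + 0.2*0.1
```

A value of k close to 1 means the two sources almost completely disagree, which is the situation a proportional redistribution rule such as PCR5 is meant to handle.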


The proposed DSmT based combination of both individual off-line and on-line HSV
systems yields an HTER of 0% at the optimal threshold value t = 0.597.
Consequently, the proposed combination of individual off-line and on-line HSV systems
using the PCR5 rule yields the best verification accuracy compared to the individual
off-line and on-line HSV systems, which provide conflicting and complementary outputs.
Figure 1.9: Conflict between off-line and on-line signatures for the writers 3, 7, and 10, respectively.
(a) Off-line HSV system 1. (b) On-line HSV system 2.


Figure 1.10: Performance evaluation of the individual off-line and on-line HSV systems. 

















HSV Systems        Optimal Threshold   FAR     FRR     HTER
System 1           0.012               12.44   12.50   12.47
System 2           0.195               0.98    0.00    0.49
Combined Systems   0.597               0.00    0.00    0.00




















Table 1.3: Error rates (%) obtained for individual and combined HSV systems. 


1.6 Conclusion 


This chapter presented a new system based on DSmT for combining individual HSV systems
that provide conflicting results. The individual HSV systems are combined through DSmT
using an estimation technique based on Appriou's dissonant model, the PCR5 rule and a
likelihood ratio test. Two cases have been addressed in order to ensure greater security:
(1) combining two individual off-line HSV systems by associating Radon and Ridgelet
features of the same off-line signature, and (2) combining individual off-line and on-line
HSV systems by associating the static image and the dynamic information of the same
signature, characterized by off-line and on-line modalities. Experimental results show in
both case studies that the proposed system using the PCR5 rule reduces the verification
errors compared to the individual HSV systems.


As a remark, although DSmT improves the verification accuracy in both studied cases, it
is clear that the achieved improvement also depends on the complementarity of the outputs
provided by the individual HSV systems. Indeed, according to the second case study, a
suitable performance of the individual on-line HSV system can be obtained when the
dynamic features of on-line signatures are carefully chosen. Combining it with the grid
features using DSmT provides a more powerful system than that of the first case study in
terms of success ratio. As a continuation of the present work, the next objectives are to
explore alternative DSmT based combinations of HSV systems in order to improve the
performance of writer-independent HSV, for genuine and forged signatures alike, in terms
of both false rejection and false acceptance.
Automatic Aircraft Recognition using DSmT and HMM


Xin-de Li 
Jin-dong Pan 
Jean Dezert 


Abstract: In this paper we propose a new method for solving
the Automatic Aircraft Recognition (AAR) problem from a
sequence of images of an unknown observed aircraft. Our method
exploits the knowledge extracted from a training image data set
(a set of binary images of different aircraft observed under three
different poses) with the fusion of information of multiple features
drawn from the image sequence using Dezert-Smarandache
Theory (DSmT) coupled with Hidden Markov Models (HMM).
The first step of the method consists in computing, for each image
of the observed aircraft, both Hu's moment invariants (the
first features vector) and the partial singular values of the outline
of the aircraft (the second features vector). In the second step,
we use a probabilistic neural network (PNN) based on the
training image dataset to construct the conditional basic belief
assignments (BBA's) of the unknown aircraft type within the set
of predefined possible target types given the features vectors
and pose condition. The BBA's are then combined by
the Proportional Conflict Redistribution rule #5 (PCR5) of DSmT
to get a global BBA about the target type under a given pose
hypothesis. These sequential BBA's give initial recognition results
that feed an HMM-based classifier for automatically recognizing
the aircraft in a multiple poses context. The last part of this
paper shows the effectiveness of this new Sequential Multiple-Features
Automatic Target Recognition (SMF-ATR) method with
realistic simulation results. This method is compliant with the
real-time processing requirements of advanced AAR systems.


Keywords: Information fusion; DSmT; ATR; HMM. 


I. INTRODUCTION 


ATR (Automatic Target Recognition) systems play a major role in the
modern battlefield for automatic monitoring, detection and identification,
and for precision guided weapons as well. The Automatic Aircraft
Recognition (AAR) problem is a subclass of the ATR problem. Many scholars
have made extensive explorations for solving ATR and AAR problems.
ATR methods are usually based on target recognition using
template matching [1], [2] and single feature (SF) extraction
[3]-[7] algorithms. Unfortunately, erroneous recognition often
occurs when utilizing target recognition algorithms based on a
single feature only, especially if there are important changes in the
pose and appearance of the aircraft along its flight path in the image
sequence. In such conditions, the informational content drawn
from single feature measures is not sufficient to make a
reliable classification. To overcome this serious drawback, new
ATR algorithms based on multiple features (MF) and fusion
techniques have been proposed [8]-[12]. An interesting MF-ATR
algorithm based on a Back-Propagation Neural Network
(BP-NN) and Dempster-Shafer Theory (DST) of evidence
[23] has been proposed by Yang et al. in [11], which has
partly inspired the development of the improved
sequential MF-ATR method presented here and introduced
briefly in [12] (in Chinese). In this paper we explain in
detail how our new SMF-ATR method works and we evaluate
its performance on a typical real image sequence.

Although the MF-ATR approach reduces the deficiency of the SF-ATR
approach in general, the recognition results can sometimes
still be indeterminate from the exploitation of a single image
because the pose and appearance of different kinds of aircraft
can be very similar for some instantaneous poses and
appearances. To eliminate (or reduce) uncertainty and improve
the classification, it is necessary to exploit a sequence of
images of the observed aircraft during its flight and to develop
efficient techniques of sequential information fusion for
advanced (sequential) MF-ATR systems. Two pioneering works
on sequential ATR algorithms using belief functions (BF)
have been proposed in recent years. In 2006, Huang et al. in
[13] developed a sequential ATR based on BF, Hu's
moment invariants (for the image features vector), a BP-NN for
pattern classification, and a modified Dempster-Shafer (DS)
fusion rule¹. An SF-ATR approach using BF, Hu's moment
invariants, BP-NN and a DSmT rule was also proposed
in [14] the same year. In these papers, the authors clearly
showed the benefit of integrating temporal SF measures
for target recognition, but the performance obtained was
still limited because of the large possible changes in the poses and
appearances of observed aircraft (especially in high maneuver
modes as far as military aircraft are concerned). The
purpose of this paper is to develop a new (sequential) MF-ATR
method able to provide a high recognition rate with good
robustness when faced with large changes of the poses and appearances
of the observed aircraft during its flight.

The general principle of our SMF-ATR method is shown in
Fig. 1. The upper part of Fig. 1 consists of Steps 1 & 2, whereas
the lower part of Fig. 1 consists of Steps 3 & 4, respectively
described as follows:


• Step 1 (Features extraction): We consider and extract
only two features vectors in this work² (Hu's moment
invariants vector, and Singular Values Decomposition
(SVD) features vector) from the binary images³.

• Step 2 (BBA's construction⁴): For every image in the sequence
and from its two features vectors, two Bayesian
BBA's on possible (target type, target pose) pairs are computed
from the results of two PNN's trained on the image
dataset. The method of BBA construction is different
from the one proposed in [12].

• Step 3 (BBA's combination): For every image, say the
k-th image, in the sequence, the two BBA's of Step 2
are combined with the PCR5 fusion rule, from which a
decision $O_k$ on the most likely target type and pose is
drawn.

• Step 4 (HMM-based classifier): From the sequence
$O^K = \{O_1, \ldots, O_k, \ldots, O_K\}$ of K local decisions
computed at Step 3, we feed several HMM-based classifiers
in parallel (each HMM characterizes one target type)
and we finally find the most likely target observed in the
image sequence, which gives the output of our SMF-ATR
approach.

Fig. 1: General principle of our sequential MF-ATR approach.

¹ Called the abortion method by the authors.
² The introduction of extra features is possible and under investigation.


The next section presents each step of this new SMF-ATR 
approach. Section 3 evaluates the performances of this new 
method on real image datasets. Conclusions and perspectives 
of this work are given in Section 4. 


II. THE SEQUENTIAL MF-ATR APPROACH 


In this section we present the aforementioned steps necessary
for the implementation of our new SMF-ATR method.


³ In this work, we use only binary images because our image training
dataset contains only binary images with clean backgrounds, and working
with binary images is easier and requires less computational burden
than working with grey-level or color images. Hence it helps to satisfy
real-time processing. The binarization of the images of the sequence under analysis
is done with the Flood Fill Method explained in detail in [22] using a
point of the background as a seed for the method.

⁴ The mathematical definition of a BBA is given in Section II-C.


A. Step 1: Features extraction from binary image 


Because aircraft poses in a flight can vary greatly, we need
image features that are stable and remain unchanged under
translation, rotation and scaling. In terms of aircraft features,
two categories are widely used: 1) moment features and 2)
contour features. Image moments have long been widely used,
especially for pattern-recognition applications [16].
Moment features, which describe regional characteristics of the
image, are mainly obtained from the intensity of each
pixel of the target image. Contour features are extracted primarily
by discretizing the outline contour and they describe the
characteristics of the outline of the object in the image. In terms
of moment features, Hu's moment invariants [6] are used here.
As contour features, we use the SVD [15] of outlines extracted
from the binary images.


• Hu's moments

Two-dimensional (p + q)-th order moments for p, q = 0, 1, 2, ... of an image
of size M × N are defined as follows:

$$m_{pq} \triangleq \sum_{m=1}^{M} \sum_{n=1}^{N} m^p n^q f(m,n) \qquad (1)$$

where f(m,n) is the value of the pixel (m,n) of the binary
image. Note that $m_{pq}$ may not be invariant when f(m,n) undergoes
translation, rotation or scaling. The invariant features can be
obtained using the (p + q)-th order central moments $\mu_{pq}$ for
p, q = 0, 1, 2, ... defined by

$$\mu_{pq} \triangleq \sum_{m=1}^{M} \sum_{n=1}^{N} (m - \bar{x})^p (n - \bar{y})^q f(m,n) \qquad (2)$$

where $\bar{x}$ and $\bar{y}$ are the barycentric coordinates of the image (i.e.
the centroid of the image). These values are computed by
$\bar{x} = \frac{m_{10}}{m_{00}} = \frac{1}{C} \sum_{m=1}^{M} \sum_{n=1}^{N} m \, f(m,n)$ and
$\bar{y} = \frac{m_{01}}{m_{00}} = \frac{1}{C} \sum_{m=1}^{M} \sum_{n=1}^{N} n \, f(m,n)$,
where C is a normalization constant given by
$C = m_{00} = \sum_{m=1}^{M} \sum_{n=1}^{N} f(m,n)$. The
central moments $\mu_{pq}$ are equivalent to the $m_{pq}$ moments whose
center has been shifted to the centroid of the image. Therefore,
the $\mu_{pq}$ are invariant to image translations. Scale invariance is
obtained by normalization [6]. The normalized central moments
$\eta_{pq}$ are defined for p + q = 2, 3, ... by $\eta_{pq} = \mu_{pq} / \mu_{00}^{\gamma}$, with
$\gamma = (p + q + 2)/2$. Based on these normalized central moments,
Hu in [16] derived seven moment invariants that are unchanged
under image scaling, translation and rotation as follows:














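$$\Phi_1 \triangleq \eta_{20} + \eta_{02}$$

$$\Phi_2 \triangleq (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$$

$$\Phi_3 \triangleq (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$$

$$\Phi_4 \triangleq (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$$

$$\Phi_5 \triangleq (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2]
+ (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$$

$$\Phi_6 \triangleq (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]
+ 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$$

$$\Phi_7 \triangleq (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2]
- (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$$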


In this work, we use only the four Hu's moments that are simplest to
compute, that is $\Phi = [\Phi_1, \Phi_2, \Phi_3, \Phi_4]$, to feed the first PNN
of our sequential MF-ATR method⁵.
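The computation of the first four invariants from Eqs. (1)-(2) can be sketched as follows. This is a minimal pure-Python illustration, assuming the binary image is given as a list of rows of 0/1 values; the function name `hu_first_four` and the toy shape are hypothetical and not part of the original method's implementation.

```python
def hu_first_four(img):
    """First four Hu moment invariants of a binary image (list of 0/1 rows)."""
    # Pixel indices start at 1, as in Eq. (1).
    pixels = [
        (m, n, img[m - 1][n - 1])
        for m in range(1, len(img) + 1)
        for n in range(1, len(img[0]) + 1)
        if img[m - 1][n - 1]
    ]
    m00 = sum(f for _, _, f in pixels)
    x_bar = sum(m * f for m, _, f in pixels) / m00  # centroid, first index
    y_bar = sum(n * f for _, n, f in pixels) / m00  # centroid, second index

    def eta(p, q):
        # Normalized central moment: mu_pq / mu_00^gamma, gamma = (p+q+2)/2.
        mu = sum((m - x_bar) ** p * (n - y_bar) ** q * f for m, n, f in pixels)
        return mu / m00 ** ((p + q + 2) / 2.0)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    e30, e03, e21, e12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    phi1 = e20 + e02
    phi2 = (e20 - e02) ** 2 + 4.0 * e11 ** 2
    phi3 = (e30 - 3.0 * e12) ** 2 + (3.0 * e21 - e03) ** 2
    phi4 = (e30 + e12) ** 2 + (e21 + e03) ** 2
    return [phi1, phi2, phi3, phi4]


def place(shape, rows, cols, dr, dc):
    """Draw a set of (row, col) pixels into an empty image with an offset."""
    img = [[0] * cols for _ in range(rows)]
    for r, c in shape:
        img[r + dr][c + dc] = 1
    return img


# Hypothetical toy shape; the same shape shifted gives the same invariants.
shape = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (3, 2)]
phi_a = hu_first_four(place(shape, 10, 10, 0, 0))
phi_b = hu_first_four(place(shape, 10, 10, 4, 5))
```

Comparing `phi_a` and `phi_b` gives a quick check of translation invariance, since the second image contains the same shape moved by a whole number of pixels.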


• SVD features of the target outline

The SVD is widely applied in signal and image processing
because it is an efficient tool to solve problems with the least
squares method [21]. The SVD theorem states that if $A_{m \times n}$
with m > n (representing in our context the original binary
data) is a real matrix⁶, then it can be written using a so-called
singular value decomposition of the form

$$A_{m \times n} = U_{m \times m} S_{m \times n} V_{n \times n}^T$$

where $U_{m \times m}$ and $V_{n \times n}$ are orthogonal⁷ matrices. The
columns of U are the left singular vectors. $V^T$ has rows that
are the right singular vectors. The real matrix S has the same
dimensions as A and has the form⁸

$$S_{m \times n} = \begin{bmatrix} S_{r \times r} & 0_{r \times (n-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (n-r)} \end{bmatrix}$$

where $S_{r \times r} = \mathrm{Diag}\{\sigma_1, \sigma_2, \ldots, \sigma_r\}$ with
$\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_r > 0$ and $1 \leq r \leq \min(m,n)$.

Calculating the SVD consists of finding the eigenvalues and
eigenvectors of $AA^T$ and $A^TA$. The eigenvectors of $A^TA$
make up the columns of V, the eigenvectors of $AA^T$ make
up the columns of U. The singular values $\sigma_1, \ldots, \sigma_r$ are the
diagonal entries of $S_{r \times r}$ arranged in descending order, and
they are the square roots of the eigenvalues of $AA^T$ or $A^TA$.

A method to calculate the set of discrete points
$\{a_1, a_2, \ldots, a_n\}$ of a target outline from a binary image
is proposed in [17]. The SVD features are then computed
from the eigenvalues of the circulant matrix built from the
discretized shape of the outline characterized by the vector
$d = [d_1, d_2, \ldots, d_n]$, where $d_i$ is the distance from the centroid
of the outline to the discrete point $a_i$, i = 1, 2, ..., n, of the
outline.

⁵ It is theoretically possible to work with all seven Hu's moments in our
MF-ATR method, but we have not tested this yet in our simulations.

⁶ For a complex matrix A, the singular value decomposition is
$A = USV^H$, where $V^H$ is the conjugate transpose of V.

⁷ They verify $U_{m \times m}^T U_{m \times m} = I_{m \times m}$ and $V_{n \times n}^T V_{n \times n} = I_{n \times n}$, where
$I_{m \times m}$ and $I_{n \times n}$ are respectively the identity matrices of dimensions m × m
and n × n.

⁸ $0_{p \times q}$ is a p × q matrix whose elements are all zero.


In our analysis, it has been verified from our image
dataset that only the first components of the SVD features vector
$\sigma = [\sigma_1, \sigma_2, \ldots, \sigma_r]$ take important values with respect to
the other ones. The other components of $\sigma$ tend quickly
towards zero. Therefore only the first few components of $\sigma$ play
an important role in characterizing the main features of the target
outline. However, if one considers only these few main first
components of $\sigma$, one fails to characterize efficiently some
specific features (details) of the target profile. By doing so,
one would limit the performance of ATR. That is why we
propose to use the partial SVDs of the outline as explained in the
next paragraph.

To capture more details of the aircraft outline with SVD, one
has to take into account also additional small singular values
of the SVD. This is done with the following procedure issued from
the face recognition research community [24]. The normalized
distance vector $\bar{d} = [\bar{d}_1, \bar{d}_2, \ldots, \bar{d}_n]$ is built from d by
taking $\bar{d} = [1, d_2/d_1, \ldots, d_n/d_1]$, where $d_1$ is the distance
between the centroid of the outline and the first chosen point
of the contour of the outline obtained by a classical⁹ edge
detector algorithm. To capture the details of the target outline and
to reduce the computational burden, one works with partial
SVDs of the original outline by considering only $l$ sliding
sub-vectors $\bar{d}_w$ of $\bar{d}$, where w is the number of components
of $\bar{d}_w$. For example, if one takes w = 3 points only in the
sub-vectors and if $\bar{d} = [\bar{d}_1, \bar{d}_2, \ldots, \bar{d}_9]$, then one will take
the sub-vectors $\bar{d}_w^1 = [\bar{d}_1, \bar{d}_2, \bar{d}_3]$, $\bar{d}_w^2 = [\bar{d}_4, \bar{d}_5, \bar{d}_6]$ and
$\bar{d}_w^3 = [\bar{d}_7, \bar{d}_8, \bar{d}_9]$ if we do not use overlapping components
between sub-vectors. From the sub-vectors, one constructs
their corresponding circulant matrices and applies the SVD to each to
get the partial SVD features vectors $\sigma^{l=1}$, $\sigma^{l=2}$, etc. The number
$l$ of partial SVDs of the original outline of the target is given
by $l = (n - w)/(w - m) + 1$, where m is the number of
components overlapped by each two adjacent sub-vectors, and
n is the total number of discrete contour points of the outline
given by the edge detector.
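The windowing procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and instead of calling a numerical SVD routine the sketch relies on a known property of circulant matrices (they are normal, so their singular values are the magnitudes of their eigenvalues, which are given by the discrete Fourier transform of the generating vector).

```python
import cmath


def normalize_distances(d):
    """Build the normalized vector [1, d2/d1, ..., dn/d1]."""
    return [x / d[0] for x in d]


def sliding_subvectors(d_bar, w, m=0):
    """Windows of width w whose neighbours share m components."""
    step = w - m
    if step <= 0:
        raise ValueError("overlap m must be smaller than window width w")
    return [d_bar[s:s + w] for s in range(0, len(d_bar) - w + 1, step)]


def circulant_singular_values(v):
    """Singular values of the circulant matrix generated by v.

    Uses |DFT(v)| sorted in descending order (property of circulant
    matrices), not a general-purpose SVD routine.
    """
    n = len(v)
    mags = [
        abs(sum(v[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n)))
        for j in range(n)
    ]
    return sorted(mags, reverse=True)


def partial_svd_features(d, w, m=0):
    """One singular-value vector per sliding window of the outline."""
    d_bar = normalize_distances(d)
    return [circulant_singular_values(sv) for sv in sliding_subvectors(d_bar, w, m)]


# Hypothetical centroid-to-contour distances for n = 9 contour points.
d = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0]
features = partial_svd_features(d, w=3)             # l = (9-3)/(3-0)+1 = 3
features_overlap = partial_svd_features(d, w=3, m=1)  # l = (9-3)/(3-1)+1 = 4
```

With n = 9 and w = 3 the sketch produces three windows without overlap and four windows with one overlapping component, in agreement with the formula for $l$.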


B. Step 2: BBA’s construction with PNN’s 


In order to exploit efficiently fusion rules dealing with
conflicting information modeled by basic belief assignments
(BBA's) [18], [23], we need to build BBA's from all features
computed from the images of the sequence under analysis. The
construction of the BBA's needs expert knowledge or knowledge
drawn from training using an image dataset. In this paper,
we propose to utilize probabilistic neural networks (PNN),
initially developed in the nineties by Specht [19], to construct the
BBA's because it is a common technique in the target
recognition and pattern classification community that is able,
with a large training dataset, to achieve performance close to that
obtained by a human expert in the field. The details of the PNN's
settings for BBA's construction are given in [12]. However,
because the neural network after training has to some extent a
good discriminant ability (close to an expert in the field), the
BBA is constructed directly from
the PNN's output, which is different from the construction of
the BBA based on the confusion matrix described in [12].

⁹ In this work, we use the cvcontour function of opencv software [22] to
extract the target outline from a binary image.

Here we present how the two PNN's (shown in Figure 1)
work. In our application, we have $N_c = 7$ types of aircraft
in our training image dataset. For each type, the aircraft is
observed with $N_p = 3$ poses. Therefore we have $N_{cp} = N_c \times N_p = 21$
distinct cases in our dataset. For each case,
one has $N_i = 30$ images available for the training. Therefore
the whole training dataset contains $N_{cpi} = N_c \times N_p \times N_i = 7 \times
3 \times 30 = 630$ binary images. For the first PNN (fed by Hu's
features vector), the number of input layer neurons is 4 because
we use only the $\Phi = [\Phi_1, \Phi_2, \Phi_3, \Phi_4]$ Hu's moment invariants in
this work. For the second PNN (fed by the partial SVD features
vector), the number of input layer neurons is constant and
equal to $l \times w$ because we take $l$ windows of width
w (so one has w singular values of partial SVD for every
window). The number of hidden layer neurons of each PNN is
the number of training samples, $N_{cpi} = 630$. The number
of output layer neurons is equal to $N_{cp} = 21$ (the number of
different possible cases).

Our PNN's fed by features input vectors (Hu's moments
and SVD outline) do not provide a hard decision on the type
and pose of the observed target under analysis because in our
belief-based approach we need to build BBA's. Therefore the
competition function of the output layer for decision-making
implemented classically in the PNN scheme is not used in
the exploitation¹⁰ phase of our approach. Instead, the PNN
computes the $N_{cp} \times N_i$ (Euclidean) distances between the
features vectors of the image under test and the $N_{cpi} = 630$
features vectors of the training dataset. A Gaussian radial
basis function (G-RBF) is used in the hidden layer of the
PNN's [19] to transform its input (Euclidean) distance vector
of size $1 \times N_{cpi}$ into another $1 \times N_{cpi}$ distance (similarity) vector
that feeds the output layer through a weighting matrix of size
$N_{cpi} \times N_{cp} = 630 \times 21$ estimated from the training samples. As
a final output of each PNN, we get an unnormalized similarity
vector m of size $(1 \times N_{cpi}) \times (N_{cpi} \times N_{cp}) = 1 \times N_{cp} = 1 \times 21$
which is then normalized to get a Bayesian BBA on the frame
of discernment $\Theta = \{(target_i, pose_j), i = 1, \ldots, c, j = 1, \ldots, p\}$.
Because we use only two¹¹ PNN's in this approach,
we are able to build two Bayesian BBA's $m_1(.)$ and $m_2(.)$
defined on the same frame $\Theta$ for every image of the sequence
to analyze.
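The way a PNN output can be turned into a Bayesian BBA may be sketched as follows. This is a simplified illustration, not the authors' implementation: the weighting matrix is assumed to be a one-hot class membership matrix (each training sample contributes only to its own (type, pose) case), the smoothing parameter `sigma` is a hypothetical setting, and the toy training vectors and labels are invented for the example.

```python
import math


def pnn_bayesian_bba(x, train_vectors, train_labels, sigma=1.0):
    """Normalized PNN similarities as masses on (type, pose) singletons."""
    scores = {}
    for v, label in zip(train_vectors, train_labels):
        # Hidden layer: Gaussian RBF of the squared Euclidean distance.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
        similarity = math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Output layer: sum similarities per case (one-hot weights assumed).
        scores[label] = scores.get(label, 0.0) + similarity
    total = sum(scores.values())
    if total == 0.0:
        # All similarities underflowed: fall back to uniform masses.
        return {label: 1.0 / len(scores) for label in scores}
    # No competition (argmax) step: keep the normalized vector as a BBA.
    return {label: s / total for label, s in scores.items()}


# Hypothetical training set with two (type, pose) cases.
train_vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
train_labels = [("type1", "pose1"), ("type1", "pose1"),
                ("type2", "pose1"), ("type2", "pose1")]
bba = pnn_bayesian_bba([0.05, 0.0], train_vectors, train_labels)
```

Because every mass sits on a singleton of the frame and the masses sum to one, the result is a Bayesian BBA in the sense defined in Section II-C.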


C. Step 3: Fusion of BBA’s and local decision 


A basic belief assignment (BBA), also called a (belief) mass
function, is a mapping $m(.) : 2^\Theta \to [0, 1]$ such that $m(\emptyset) = 0$
and $\sum_{X \in 2^\Theta} m(X) = 1$, where $\Theta$ is the so-called frame of
discernment of the problem under concern, which consists of
a finite discrete set of exhaustive and exclusive hypotheses¹²
$\theta_i$, i = 1, ..., n, and where $2^\Theta$ is the power-set of $\Theta$ (the set of
all subsets of $\Theta$). This definition of BBA has been introduced
in Dempster-Shafer Theory (DST) [23]. The focal elements
of a BBA are all elements X of $2^\Theta$ such that m(X) > 0.
Bayesian BBA's are special BBA's having only singletons (i.e.
the elements of $\Theta$) as focal elements.

¹⁰ When analyzing a new sequence of an unknown observed aircraft.
¹¹ A first PNN fed by Hu's features, and a second PNN fed by SVD outline
features; see Fig. 1.


In DST, the combination of BBA's is done by Dempster's
rule of combination [23], which corresponds to the normalized
conjunctive consensus operator. Because this fusion rule is
known to behave poorly in some practical situations (both in
highly and in low conflicting cases) [25], many alternative
rules have been proposed during the last decades [18], Vol. 2.


To overcome the practical limitations of Shafer's model
and in order to deal with fuzzy hypotheses of the frame,
Dezert and Smarandache have proposed the possibility to
work with BBA's defined on Dedekind's lattice¹³ $D^\Theta$ [18]
(Vol. 1) so that intersections (conjunctions) of elements of the
frame can be allowed in the fusion process, possibly with
some given restrictions (integrity constraints). Dezert and
Smarandache have also proposed several rules of combination
based on different Proportional Conflict Redistribution (PCR)
principles. Among these new rules, the PCR5 and PCR6 rules
play a major role because they do not degrade the specificity of
the fusion result (contrary to most other alternative rules),
and they preserve the neutrality of the vacuous BBA¹⁴. PCR5
and PCR6 provide the same combined BBA when combining
only two BBA's $m_1(.)$ and $m_2(.)$, but they differ when
combining three (or more) BBA's altogether. It has been
recently proved in [26] that PCR6 is consistent with the empirical
(frequentist) estimation of a probability measure, unlike other
fusion rules¹⁵. These two major differences with DST make up
the basis of Dezert-Smarandache Theory (DSmT) [18].


In the context of this work, we propose to use PCR5 to
combine the two (Bayesian) BBA's $m_1(.)$ and $m_2(.)$ built from
the two PNN's fed by Hu's features vector and SVD outline
features vector. Because for each image of the observed target
in the sequence, one has only two BBA's to combine, the PCR5
fusion result is the same as the PCR6 fusion result. Of course,
if one wants to include other kinds of features vectors with
additional PNN's, the PCR6 fusion rule is recommended. The
PCR principle consists in redistributing the partial conflicting
masses¹⁶ only to the sets involved in the conflict and
proportionally to their mass. The PCR5 (or PCR6) combination of
two BBA's is done according to the following formula¹⁷ [18]

$$m_{PCR5/6}(X) = \sum_{\substack{X_1, X_2 \in 2^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2)
+ \sum_{\substack{Y \in 2^\Theta \setminus \{X\} \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2 m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2 m_1(Y)}{m_2(X) + m_1(Y)} \right] \qquad (3)$$

where all denominators in (3) are different from zero, and
$m_{PCR5/6}(\emptyset) = 0$. If a denominator is zero, that fraction
is discarded. All propositions/sets are in a canonical form.
Because we work here only with Bayesian BBA's, the previous
fusion formula is in fact rather easy to implement, see [18]
(Vol. 2, Chap. 4).

¹² This is what is called Shafer's model of the frame in the literature.

¹³ Dedekind's lattice is the set of all composite subsets built from elements
of $\Theta$ with $\cup$ and $\cap$ operators.

¹⁴ A vacuous BBA is the BBA such that $m(\Theta) = 1$.

¹⁵ Except the averaging rule.

¹⁶ For two BBA's, a partial conflicting mass is a product $m_1(X) m_2(Y) > 0$
of the element $X \cap Y$ which is conflicting, that is such that $X \cap Y = \emptyset$.

In summary, the target features extraction in a sequence of
K images allows us to generate, after Step 3, a set of BBA's
$\{m^{Image_k}(.), k = 1, 2, \ldots, K\}$. Every BBA $m^{Image_k}(.)$ is
obtained by the PCR5/6 fusion of the BBA's $m_1^{Image_k}(.)$ and
$m_2^{Image_k}(.)$ built from the outputs of the two PNN's. From this
combined BBA, a local¹⁸ decision $O_k$ can be drawn about
the target type and target pose in $Image_k$ by taking the focal
element of $m^{Image_k}(.)$ having the maximum mass of belief.


D. Step 4: Hidden Markov Model (HMM) for recognition 


Usually (and especially in a military context), the posture of
an aircraft can change continuously and substantially along its flight path,
making target recognition based on a single observation
(image) very difficult, because ambiguities can occur
between the extracted features and those stored in the training
image data set. To improve the target recognition performance
and robustness, we propose to use the sequence of target
recognition decisions $O_k$ drawn from the BBA's $\{m^{Image_k}(.), k =
1, 2, \ldots, K\}$ to feed HMM classifiers in parallel. We suggest
this approach because HMMs have already
proved to be very efficient in speech recognition, natural
language and face recognition. We briefly present HMMs, and
then explain how they are used for automatic
aircraft recognition.

Let us consider a dynamical system with a finite set of possible
states $S = \{s_1, s_2, \ldots, s_N\}$. The state transitions of the
system are modeled by a first-order Markov chain governed by
the transition probabilities $P(s(t_k) = s_j \mid s(t_{k-1}) = s_i,
s(t_{k-2}) = s_k, \ldots) = P(s(t_k) = s_j \mid s(t_{k-1}) = s_i) = a_{ij}$,
where $s(t)$ is the random state of the system at time $t$. A
HMM is a doubly stochastic process including an underlying
stochastic process (i.e. a Markov chain for modeling the state
transitions of the system), and a second stochastic process
for modeling the observation of the system (which is a
function of the random states of the system). A HMM, denoted
$\lambda = (A, B, \Pi)$, is fully characterized by the knowledge of the
following parameters:


¹⁷ Here we assume that Shafer's model holds. The notation $m_{PCR5/6}$
means that PCR5 and PCR6 are equivalent when combining two BBA's.

¹⁸ Because it is based only on a single image of the unknown observed target
in the sequence under analysis.


1) The number $N$ of possible states $S = \{s_1, s_2, \ldots, s_N\}$
of the Markov chain.

2) The state transition probability matrix¹⁹ $A = [a_{ij}]$ of
size $N \times N$, where $a_{ij} = P(s(t_k) = s_j \mid s(t_{k-1}) = s_i)$.

3) The prior mass function (pmf) $\Pi$ of the initial state of
the chain, that is $\Pi = \{\pi_1, \ldots, \pi_N\}$ with $\sum_i \pi_i = 1$,
where $\pi_i = P(s(t_1) = s_i)$.

4) The number $M$ of possible values $V = \{v_1, \ldots, v_M\}$
taken by the observation of the system.

5) The conditional pmfs of observed values given the states
of the system, characterized by the matrix $B = [b_{mi}]$ of
size $M \times N$, with $b_{mi} \triangleq P(O_k = v_m \mid s(t_k) = s_i)$,
where $O_k$ is the observation of the system (i.e. the local
decision on target type with its pose) at time $t_k$.


In this work we consider a set of $N_c$ HMMs in parallel,
where each HMM is associated with a given type of target
to recognize. We consider the following state and observation
models in our HMMs:

- State model: For a given type of aircraft, we consider a
finite set of distinct aircraft postures available in our training
image dataset. In our application, we consider only three states
corresponding to $s_1$ = top view, $s_2$ = side view and $s_3$ =
front view, as shown (for a particular aircraft) in Figure 2.





Fig. 2: Example of HMM states. 


- Observation model: In our HMMs, we assume that each
state (posture) of aircraft is observable. Since we have
only $N_p = 3$ states $S = \{s_1, s_2, s_3\}$ for each aircraft,
and we have $N_c = 7$ types of aircraft in the training
dataset, we have to deal with $N_{cp} = 3 \times 7 = 21$ possible²⁰
observations (local decisions) at each time $t$. As explained
previously, at the end of Step 3 we have a set of BBA's
$\{m^{Image_k}(.), k = 1, 2, \ldots, K\}$ that helps to draw the sequence
of local decisions $O^K \triangleq \{O_1, \ldots, O_k, \ldots, O_K\}$. This
sequence of decisions (also called recognition observations)
is used to evaluate the likelihood $P(O^K \mid \lambda_i)$ of the different
HMMs described by the parameters $\lambda_i = (A_i, B_i, \Pi_i)$,
$i = 1, 2, \ldots, N_c$. The computation of these likelihoods will
be detailed at the end of this section. The final decision
for ATR consists in inferring the true target type based on
the maximum likelihood criterion. More precisely, one will
decide that the target type is $i^*$ if $i^* = \arg\max_i P(O^K \mid \lambda_i)$.


• Estimation of HMM parameters

To perform recognition with HMMs, we first need to define
a HMM for each type of target one wants to recognize.
More precisely, we need to estimate the parameters $\lambda_i =$


¹⁹ We assume that the transition matrix is known and time-invariant, i.e. all
elements $a_{ij}$ do not depend on $t_{k-1}$ and $t_k$.

²⁰ We assume that the unknown observed target type belongs to the set of
types of the dataset, as well as its pose.




$(A_i, B_i, \Pi_i)$, where $i = 1, \ldots, N_c$ is the target type in the
training dataset. The estimation of the HMM parameters is done
from observation sequences drawn from the training dataset
with the Baum-Welch algorithm [20], which must be initialized with
a chosen value $\lambda_i^0 = (A_i^0, B_i^0, \Pi_i^0)$. This initial value is chosen
as follows:

1) State prior probabilities $\Pi_i^0$ for a target of type $i$: For each
HMM, we consider only three distinct postures (states) $s_1$, $s_2$
and $s_3$ for the aircraft. We use a uniform prior probability
mass distribution for all types of targets. Therefore, we take
$\Pi_i^0 = [1/3, 1/3, 1/3]$ for any target type $i = 1, \ldots, N_c$ to
recognize.

2) State transition matrix $A_i^0$ of a target of type $i$: The
components $a_{pq}$ of the state transition matrix $A_i^0$ are estimated
from the analysis of many sequences²¹ of target $i$ as follows:

$$a_{pq} = \frac{\sum_{k=1}^{K-1} \delta(s(t_k), s_p) \times \delta(s(t_{k+1}), s_q)}{\sum_{k=1}^{K-1} \delta(s(t_k), s_p)} \qquad (4)$$

where $p, q = 1, \ldots, N_p$, $N_p$ is the number of states of the Markov chain,
$\delta(x, y)$ is the Kronecker delta function defined by $\delta(x, y) = 1$
if $y = x$, and $\delta(x, y) = 0$ otherwise, and where $K$ is
the number of images in the sequence of target $i$ available
in the training phase. For example, if in the training
phase and for a target of type $i = 1$, we have the
following sequence of (target type, pose) cases given by
[(1,1), (1,1), (1,2), (1,1), (1,3), (1,1), (1,1)], then from Eq.
(4) with $K = 7$, we get²²

$$A_{i=1}^0 = \begin{bmatrix} 2/4 & 1/4 & 1/4 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}$$

3) Observation matrix $B_i^0$ for a target of type $i$: The
initial observation matrix $B_i^0$ is given by the confusion matrix
learnt from all images of the training dataset. More precisely,
from every image of the training dataset, we extract Hu's
features and partial SVD outline features and we feed each
PNN to get two BBA's according to Steps 1-3. From the
combined BBA, we make the local decision $(target_i, pose_j)$ if
$m((target_i, pose_j))$ is bigger than all other masses of belief
of the BBA. This procedure is applied to all images in the
training dataset. By doing so, we can estimate empirically
the probabilities of deciding $(target_i, pose_j)$ when the real case
$(target_{i'}, pose_{j'})$ occurs. So we have an estimation of all
components of the global confusion matrix $B^0 = [P(decision =
(target_i, pose_j) \mid reality = (target_{i'}, pose_{j'}))]$. From $B^0$
we extract the $N_c$ sub-matrices (conditional confusion matrices)
$B_i^0$, $i = 1, \ldots, N_c$, by taking all the rows of $B^0$ corresponding
to the target of type $i$. In our application, one has $N_c = 7$
types and $N_p = 3$ postures (states) for each target type, hence
one has $N_{cp} = 7 \times 3 = 21$ possible observations. Therefore
the global confusion matrix $B^0$ has size $21 \times 21$ and is the stack
of $N_c = 7$ sub-matrices $B_i^0$, $i = 1, \ldots, N_c$, each of size
$N_p \times N_{cp} = 3 \times 21$.
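
The construction of the global confusion matrix and its split into per-type blocks can be sketched as follows. This is only an illustration under stated assumptions: each (target, pose) case is coded as a single index type × $N_p$ + pose (both zero-based), rows are indexed by the real case and columns by the decision, and the function names are hypothetical.

```python
def confusion_matrix(pairs, n_obs):
    """Row-normalized confusion matrix from (reality, decision) index pairs."""
    counts = [[0] * n_obs for _ in range(n_obs)]
    for reality, decision in pairs:
        counts[reality][decision] += 1
    rows = []
    for row in counts:
        total = sum(row)
        # Assumption: a case never seen in training keeps an all-zero row.
        rows.append([c / total if total else 0.0 for c in row])
    return rows


def split_by_target_type(b_global, n_types, n_poses):
    """Cut the global matrix into n_types blocks of n_poses rows each."""
    assert len(b_global) == n_types * n_poses
    return [b_global[i * n_poses:(i + 1) * n_poses] for i in range(n_types)]


n_types, n_poses = 7, 3
n_obs = n_types * n_poses  # 21 possible observations
# Toy training decisions: every case decided correctly, for illustration only.
pairs = [(case, case) for case in range(n_obs)]
B0 = confusion_matrix(pairs, n_obs)
blocks = split_by_target_type(B0, n_types, n_poses)  # 7 blocks of size 3 x 21
```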


²¹ The video streams of different (known) aircraft flights generate the
sequences of images used to estimate $a_{pq}$ approximately.

²² One verifies that the probabilities in each row of this matrix sum to 1.


• Exploitation of HMM for ATR

Given a sequence $O^K$ of $K$ local decisions drawn from the
sequence of $K$ images, and given $N_c$ HMMs characterized by
their parameters $\lambda_i$ ($i = 1, \ldots, N_c$), one has to compute all the
likelihoods $P(O^K \mid \lambda_i)$, and then infer from them the true target
type based on the maximum likelihood criterion, which is done
by deciding the target type $i^*$ if $i^* = \arg\max_i P(O^K \mid \lambda_i)$. The
computation of $P(O^K \mid \lambda_i)$ is done as follows [20]:

• generation of all possible state sequences of length
$K$, $S_l^K = [s_l(t_1) s_l(t_2) \ldots s_l(t_K)]$, where $s_l(t_k) \in S$
($k = 1, \ldots, K$) and $l = 1, 2, \ldots, |S|^K$;

• computation of $P(O^K \mid \lambda_i)$ by applying the total probability
theorem as follows²³:


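$$P(S_l^K \mid \lambda_i) = \pi_{s_l(t_1)} \cdot a_{s_l(t_1) s_l(t_2)} \cdot \ldots \cdot a_{s_l(t_{K-1}) s_l(t_K)} \qquad (5)$$

$$P(O^K \mid \lambda_i, S_l^K) = b_{s_l(t_1) O_1} \cdot b_{s_l(t_2) O_2} \cdot \ldots \cdot b_{s_l(t_K) O_K} \qquad (6)$$

$$P(O^K \mid \lambda_i) = \sum_{l=1}^{|S|^K} P(O^K \mid \lambda_i, S_l^K) P(S_l^K \mid \lambda_i) \qquad (7)$$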
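
Equations (5)-(7) and the maximum likelihood decision can be sketched in Python as below. The enumeration follows the equations literally and visits $|S|^K$ state sequences, so it is only usable for very short sequences. The forward recursion shown next to it is not described in the text above; it is the standard equivalent computation and is included here as an assumption about how one would evaluate long sequences in practice. States and observations are coded as zero-based integers, `B[s][o]` is indexed as in Eq. (6), and all function names are hypothetical.

```python
from itertools import product


def likelihood_bruteforce(pi, A, B, obs):
    """P(O^K | lambda) by summing over all state sequences, Eqs. (5)-(7)."""
    n = len(pi)
    total = 0.0
    for seq in product(range(n), repeat=len(obs)):
        p_seq = pi[seq[0]]                      # Eq. (5)
        for prev, cur in zip(seq, seq[1:]):
            p_seq *= A[prev][cur]
        p_obs = 1.0                             # Eq. (6)
        for state, o in zip(seq, obs):
            p_obs *= B[state][o]
        total += p_obs * p_seq                  # Eq. (7)
    return total


def likelihood_forward(pi, A, B, obs):
    """Same likelihood via the forward recursion (swapped-in technique)."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[r] * A[r][s] for r in range(n)) * B[s][o]
                 for s in range(n)]
    return sum(alpha)


def classify(models, obs):
    """Index i* of the model (pi, A, B) maximizing P(O^K | lambda_i)."""
    scores = [likelihood_forward(pi, A, B, obs) for pi, A, B in models]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy two-state model with three observation symbols (values are made up).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2, 2]
p_brute = likelihood_bruteforce(pi, A, B, obs)
p_forward = likelihood_forward(pi, A, B, obs)
```

For sequences of several hundred images, the raw product of probabilities may underflow in floating point; a scaled or log-domain version of the recursion would then be needed, which this sketch does not include.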
III. SIMULATIONS RESULTS 


For the simulations of the SMF-ATR method, we have used
$N_c = 7$ types of aircraft in the training image dataset. Each
image of the sequence has 1200 × 702 pixels. The sequences
of aircraft observations in the training dataset take 150 frames.
The $N_p = 3$ poses of every aircraft are shown in Fig. 3.
For evaluating our approach, we have used sequences (test
samples) of images of 7 different aircraft, more precisely
the Lockheed-F22, Junkers-G.38ce, Tupolev ANT 20 Maxime
Gorky, Caspian Sea Monster (Kaspian Monster), Mirage-F1,
Piaggio P180, and Lockheed-Vega, flying under conditions that
generate a lot of state (posture) changes in the images. The
number of images in each test sequence varies from
400 to 500. The shaping parameter of the G-RBF of the PNN's
has been set to 0.1. The simulation is done in two phases: 1)
the training phase (for training the PNN's and estimating the HMM
parameters), and 2) the exploitation phase for testing the real
performances of SMF-ATR with test sequences.

A - Performances evaluation 


In our simulations, we have tested SMF-ATR with two
different fusion rules: 1) the PCR5 rule (see Section II-C),
and 2) the Dempster-Shafer (DS) rule²⁴ [23]. The percentages of
successful recognition (i.e. the recognition rate $R_i$) obtained
with these two SMF-ATR methods are shown in Table I for
each type $i = 1, 2, \ldots, N_c$ of aircraft. The performances of
these SMF-ATR versions are globally very good, since one
is able to recognize the types of aircraft included in the image
sequences under test with a minimum of 85.2% of success
when using DS-based SMF-ATR, and with a minimum of


²³ The index $i$ of the components of the $A_i$ and $B_i$ matrices has been omitted for
notation convenience in the last two formulas.

²⁴ Because Dempster's rule is one of the bases of Dempster-Shafer Theory,
we prefer to call it the Dempster-Shafer rule, or just the DS rule. This rule
coincides here with the Bayesian fusion rule because we combine two Bayesian
BBA's and we don't use informative priors.

Fig. 3: Poses of different types of aircraft.


93.5% of success with the PCR5-based SMF-ATR. In terms of
computational time, it takes between 5 ms and 6 ms to process
each image in the sequence with no particular optimization
of our simulation code, which indicates that this SMF-ATR
approach is close to meeting the requirement for real-time aircraft
recognition. It can be observed that PCR5-based SMF-ATR
outperforms DS-based SMF-ATR for 3 types of aircraft (types 3,
4 and 7 in Table I) and gives a similar recognition rate to
DS-based SMF-ATR for the other types. So PCR5-based SMF-ATR is
globally better than DS-based SMF-ATR for our application.








Target type       | 1    | 2    | 3    | 4    | 5    | 6    | 7
$R_i$ (PCR5 rule) | 95.7 | 93.5 | 96.3 | 98.2 | 96.3 | 98.5 | 97.3
$R_i$ (DS rule)   | 95.7 | 93.5 | 85.2 | 97.8 | 96.3 | 98.5 | 97.2

TABLE I: Aircraft recognition rates $R_i$ (in %).


B - Robustness of SMF-ATR to image scaling 


To evaluate the robustness of the (PCR5-based) SMF-ATR
approach to image scaling effects, we applied scaling changes
(zoom out) of ZO = 1/2, ZO = 1/4 and ZO = 1/8 to the
images of the sequences under test. The performances of
SMF-ATR are shown in Table II. One sees that the degradation
of recognition performance of SMF-ATR due to scaling effects
is very limited, since even with a 1/8 zoom out one gets 90%
of successful target recognition. The performance will decline
sharply if the target zoom out goes beyond 1/16.

C - Robustness to compound type 


Table III gives the performances of SMF-ATR on sequences 
with two types of targets (475 images with type 1, and 382 
images with type 2). 

The two left columns of Table III show the performances








Target type      | 1    | 2    | 3    | 4    | 5    | 6    | 7
$R_i$ (no ZO)    | 95.7 | 93.5 | 96.3 | 98.2 | 96.3 | 98.5 | 97.3
$R_i$ (ZO = 1/2) | 95.0 | 92.0 | 95.2 | 94.7 | 96.1 | 96.6 | 95.4
$R_i$ (ZO = 1/4) | 95.0 | 92.0 | 94.7 | 91.7 | 93.6 | 91.6 | 95.7
$R_i$ (ZO = 1/8) | 95.0 | 92.2 | 93.1 | 89.3 | 93.6 | 94.5 | 90.7

TABLE II: Aircraft recognition rates $R_i$ (in %) of (PCR5/6-based)
SMF-ATR with different zoom out values.








Aircraft        | Single Type 1 | Single Type 2 | Compound Type
$R_i$ (SMF-ATR) | 96.3%         | 98.5%         | 97.3%

TABLE III: Robustness to target compound.


obtained when recognizing each type separately in each
sub-sequence. The last column shows the performance when
recognizing the compound type Type 1 ∪ Type 2. One sees
that the performance obtained with the compound type (97.3%) is
close to the weighted average²⁵ 97.5% recognition rate. This
indicates that no wide range of recognition errors occurs when
the target type changes during the recognition process, making
SMF-ATR robust to target type switch.


D - Performances with and without HMMs 


We have also compared the performances of SMF-ATR
with two methods that use more features but do not exploit
sequences of images with HMMs. More precisely, the recognition
is done locally from the combined BBA for every image,
without temporal integration processing based on HMMs. We
call these two Multiple Features Fusion methods MFF1 and
MFF2 respectively. MFF1 uses Hu's moments, NMI
(Normalized Moment of Inertia), affine invariant moments, and
SVD of outline, with PNN and PCR5 fusion, whereas MFF2 uses
the same features as MFF1 but with a BP network as classifier
and the DS rule of combination. The recognition performances are
shown in Table IV. One sees clearly the advantage of using
image sequence processing with HMMs, because of the significant
improvement of ATR performances. The recognition rate of
MFF2 declines seriously because the convergence of the BP
network is not good enough.
































Target type     | 1    | 2    | 3    | 4    | 5    | 6    | 7
$R_i$ (SMF-ATR) | 95.7 | 93.5 | 96.3 | 98.2 | 96.3 | 98.5 | 97.3
$R_i$ (MFF1)    | 89.2 | 92.0 | 91.2 | 86.9 | 92.2 | 93.5 | 95.0
$R_i$ (MFF2)    | 64.9 | 51.6 | 82.8 | 82.2 | 70.8 | 48.3 | 58.9

TABLE IV: Performances (in %) with and without HMMs.


E - SMF-ATR versus SSF-ATR 


We have also compared in Table V the performances of
SMF-ATR with those of two simple SSF-ATR²⁶ methods,
called SSF1-ATR and SSF2-ATR. SSF1-ATR uses only
Hu's moments features, whereas SSF2-ATR uses only SVD
of outline as features. SSF1-ATR exploits image sequence
information using BP networks as classifier and the DS rule for
combination, while SSF2-ATR uses PNN and the PCR5/6 rule.


²⁵ According to the proportion of the two types in the whole sequence.

²⁶ SSF-ATR stands for Single-feature Sequence Automatic Target Recognition.

Target type      | 1    | 2    | 3    | 4    | 5    | 6    | 7
$R_i$ (SMF-ATR)  | 95.7 | 93.5 | 96.3 | 98.2 | 96.3 | 98.5 | 97.3
$R_i$ (SSF1-ATR) | 39.3 | 42.3 | 74.3 | 56.7 | 60.1 | 33.9 | 44.3
$R_i$ (SSF2-ATR) | 88.8 | 66.4 | 86.7 | 66.9 | 73.6 | 52.9 | 63.8

TABLE V: Performances (in %) of SMF-ATR and SSF-ATR.


One clearly sees the serious advantage of SMF-ATR with
respect to SSF-ATR, due to the combination of information
drawn from both kinds of features (Hu's and SVD of outline)
extracted from the images.


IV. CONCLUSIONS AND PERSPECTIVES 


A new SMF-ATR approach based on features extraction has
been proposed. The features extracted from binary images feed
PNNs for building basic belief assignments that are combined
with the DSmT PCR rule to make a local (based on one image
only) decision on target type. The set of local decisions
acquired over time for the image sequence feeds HMMs to make
the final recognition of the target. The evaluation of this new
SMF-ATR approach has been done with realistic sequences
of aircraft observations. SMF-ATR is able to achieve higher
recognition rates than classical approaches that do not exploit
HMMs, or SSF-ATR. A complementary analysis of the
robustness of SMF-ATR to target occultation is currently in
progress and will be published in a forthcoming paper. Our
very preliminary results, based on only a few sequences,
indicate that SMF-ATR seems very robust to target occultations
occurring randomly in single (non-consecutive) images, but a
finer analysis based on Monte-Carlo simulation will be done
to evaluate quantitatively its robustness in different conditions
(number of consecutive occultations in the sequences, the level
of occultation, etc). As interesting perspectives, we want to
extend the SMF-ATR approach to detect new target types that
are not included in the image data set. We also want to
deal with the recognition of multiple crossing targets observed
in the same image sequence.


Abstract


Information fusion includes signals, features, and decision-level analysis over 
various types of data including imagery, text, and cyber security detection. With the 
maturity of data processing, the explosion of big data, and the need for user acceptance,
the Dynamic Data-Driven Application System (DDDAS) philosophy fosters insights
into the usability of information systems solutions. In this paper, we explore a notion of
an adaptive adjustment of secure communication trust analysis that seeks a balance
between standard static solutions and dynamic data-driven updates. A use case is
provided in determining trust for a cyber security scenario exploring comparisons of 
Bayesian versus evidential reasoning for dynamic security detection updates. Using the 
evidential reasoning proportional conflict redistribution (PCR) method, we demonstrate 
improved trust for dynamically changing detections of denial of service attacks. 


1 Introduction 


Information fusion (Blasch, et al., 2012) has a well-documented following of different methods, 
processes, and techniques emerging from control, probability, and communication theories. 
Information fusion systems designs require methods for big data analysis, secure communications, 
and support to end users. Current information fusion systems use probability, estimation, and signal 
processing. Extending these techniques to operational needs requires an assessment of some of the
fundamental assumptions such as secure communications over various data, applications, and 
systems. Specifically, the key focus of this paper is based on the question of measuring trust in static 
versus dynamic information fusion systems. 

Static versus dynamic information fusion can be viewed from three perspectives: data, models, and
processing. As related to information fusion techniques, many studies exist on centralized versus
distributed processing, single versus multiple models, and stovepipe versus multi-modal data. In each
case, static information fusion rests in centralized processing from single model estimation over a
single source of data. On the other extreme is distributed processing, using multiple models over
multi-modal data, which is supposed to cover the entire gamut of big data solutions captured
in large-scale systems designs. In reality, with such an ambitious goal, there are always fundamental
assumptions that tailor the system design to the user needs. For example, a system could be designed
to capture all image data being collected from surveillance sensors; however, filtering collections over
a specific area, for a designated time interval, at a given frequency helps to refine answers to user
requests. Thus, as a user selects the details of importance, responses should be accessible, complete,
and trustworthy.

Dynamic information fusion is a key topic of this paper, in which we focus on trust. If a machine
is processing all the data, then time and usability constraints cannot be satisfied. Thus, either the user
or the machine must determine the appropriate set of data, models, and processing that is needed for a
specific application. Trust analysis is required to determine security and reliability constraints, and
DDDAS provides a fresh look at the balance between static and dynamic information fusion. In this
paper, we explore the notions of dynamic information fusion towards decision making as cyber
detections change.

In Section 2 we give an overview of information fusion and DDDAS. Section 3 discusses the notions of trust
as a means to balance between information fusion and dynamic data detections. Section 4 compares
Bayesian versus evidential reasoning. Section 5 provides a use case for analysis of cyber trust, and
Section 6 provides conclusions.


2 Information Fusion and DDDAS 


Information fusion and DDDAS overlap in many areas such as data measurements, statistical 
reasoning, and software development for various applications. Recently, there is an interest in both 
communities to address big data, software structures, and user applications. The intersection of these 
areas includes methods of information management (Blasch, 2006) in assessing trust in data access, 
dynamic processing, and distribution for applications-based end users. 


2.1 Information Fusion 


The Data Fusion Information Group (DFIG) model, shown in Figure 1, provides the various
attributes of an information fusion systems design. Information fusion concepts are divided between
Low-level Information Fusion (LLIF) and High-level Information Fusion (HLIF) (Blasch, et al.,
2012). LLIF (L0-1) comprises data registration (Level 0 [L0]) and explicit object assessment (L1)
such as an aircraft location and identity (Yang, 2009). HLIF (L2-6) comprises much of the open
discussions in the last decade. The levels, to denote processing, include situation (L2) and impact (L3)
assessment with resource (L4), user (L5) (Blasch, 2002), and mission (L6) refinement (Blasch, 2005).
Here we focus on Level 5 fusion by addressing cyber security trust in systems design.

Figure 1. DFIG Information Fusion model (L = Information Fusion Level).

Data access for information fusion requires an information management (IM) model of the enterprise
architecture, as shown in Figure 2. The IM model illustrates the coordination and flow of data through
the enterprise with the various layers (Blasch, et al., 2012).


People or autonomous agents interact with the managed information enterprise environment by 
producing and consuming information. Various actors and their activities/services within an IM 
enterprise surround the IM model that transforms data into information. Within the IM model, there 
are various services that are needed to process the managed information objects (MIOs). Security is 
the first level of interaction between users and data. 

[Figure 2 labels: Managers; Operating Environment and Mission Roles; Security (Control Access and Audit Logs, Sanitize Cross-Domain Content); Federates; Meta Data Standards; Data Products; Instantiate and Maintain Workflow; Log Transactions; Producers; Formats and Standards; Consumers]

Figure 2. Information Management (IM) Model.


A set of service layers is defined that use artifacts to perform specific services. An artifact is a
piece of information that is acted upon by a service or that influences the behavior of the service (e.g.,
a policy). The service layers defined by the model are: Security, Workflow, Quality of Service (QoS),
Transformation, Brokerage, and Maintenance. These services are intelligent agents that utilize the
information space within the architecture, such as cloud computing and machine analytics. Access to
the data requires secure communications, which are dynamic, data-type driven, and application specific.


2.2 Dynamic Data Driven Application Systems (DDDAS) 


DDDAS is focused on applications modeling (scenarios), mathematical and statistical algorithms
(theory), measurement systems, and systems software, as shown in Figure 3. For a systems application,
user mission needs drive data access over the scenarios. The available data is processed from
measurements to information using theoretical principles. The data-driven results are presented to the
user through visualizations; however, the trust in the data is compounded by data quality, model
fidelity, and systems availability, of which software is an integral part of a systems application.

Figure 3. DDDAS Aligned with Information Fusion.


Using a cyber example for DDDAS, the application is secure data communications to meet
mission needs (L6). While not a one-to-one mapping, it can be assumed that data management, driven
by scenarios, identifies cyber threat attacks (L3) such as denial of service attacks. The theory and
measurements come from the models of normal behavior (L1), which use computational methods to
support cyber situation awareness (L2) visualization. The user (L5) interacts with the machine through
data management (L4) as new measurements arrive. Current research seeks distributed, faster, and
more reliable communication systems to enable such processing and coordination between users
and their machines; however, measurement of trust is paramount.


3 Trust in Information Processing 


Several theories and working models of trust in automation have been proposed. Information
which is presented for decision-aiding is not uniformly trusted and incorporated into situation
awareness. Three proposed increasing levels, or 'stages of trust', for human-human interactions
include: Predictability, Dependability, and Faith (Rempel, et al., 1985). Participants progress through
these stages over time in a relationship. The same was anticipated in human-automation interactions,
either via training or experience. The main idea is that as trust develops, people will make decisions
based upon the trust that the system will continue to behave in new situations as it has demonstrated in
the past. Building upon Rempel's stages, (Muir & Moray, 1996) postulated that


Trust = Predictability + Dependability + Faith + Competence + Responsibility + Reliability


and further defined the construct of Distrust, which involves (1) the operator's feeling that the
automation is undependable, unreliable, unpredictable, etc., and (2) a set of dimensions related to
automation failures, which may cause distrust in automated systems (location of failure, causes of
failure or corruption, time patterns of failure).


Table 1, adapted below from (Muir & Moray, 1996), depicts the quadrant of trust and distrust
behaviors with respect to good or poor quality of the automation. Basically, the outcome of a wrong
decision to trust the automation is worse than the outcome of a wrong decision to not trust the
automation. Hence, security is enforced to not trust a poor decision.

Operator's trust & allocation of function | Quality of the automation: 'Good' | Quality of the automation: 'Poor'
Trusts and uses the automation | Appropriate Trust (optimize system performance) | False Trust (risk automated disaster)
Distrusts and rejects the automation | False Distrust (lose benefits of automation, inc. workload) | Appropriate Distrust (optimize system performance)

Table 1: Trust, Distrust, and Mistrust (adapted from Muir and Moray, 1996)





Trust in the automation clearly impacts a user's mental model of secure communications. Therefore,
dynamic models must be devised to account for different levels of attention, trust, and interactions in
Human in the Loop (HIL) and Human on the Loop (HOL) designs. A user must be given permission
to refine the assessment for the final decision on the validity and reliability of the information presented.
User trust issues then are confidence (correct detection), security (impacts), integrity (what you
know), dependability (timely), reliability (accurate), controllability, familiarity (practice and training), and
consistency (reliable).

Trust in information processing involves many issues; however, here we focus on the development
of a cyber domain trust stack, as shown in Figure 4. The trust stack comprises policies, trust authority,
collection of raw metrics and behavior analysis, leading to authentication and authorization, and then
secure communications. Similar to the information management model, policies are important to
determine whether data access is available. Likewise, sensor management gets access to raw metrics
(Blasch, 2004) that need to be analyzed for situation awareness. The problem not being fully addressed
is the impending results for secure communications. In what follows, we discuss the main functions to
be provided by each layer in the trust stack shown in Figure 4.


[Figure 4 layers: Policies Enforcement; Domain Trust Enforcement; Measurements (Situation Awareness); Authentication and Authorization; Secure Communication]

Figure 4. Trust Stack.

















3.1 Secure Communications, Authentication, and Authorization 


Secure communications is an important property to guarantee the confidentiality and integrity of
the messages used to evaluate trust in the system. Certificates are used to verify the identity of
communicating end-devices (Kaliski, 1993). The communication channel is encrypted using DES
(Data Encryption Standard, 2010) in CFB64 (Cipher Feedback) mode. In this CFB mode, the first 8
bytes of the generated key are used to encrypt the first block of data. This encrypted data is then used as a
key for the second block. This process is repeated until the last block is encrypted. DES is still
used in legacy virtual private networks (VPNs) and could benefit from a DDDAS trust analysis, even
when used with multiple-protocol authentication systems such as Kerberos.

Multiple protocols have been developed over the years for password-based authentication,
biometric authentication, and remote user authentication. In order to evaluate the trust of different
entities with many users, multiple systems, and multiple domains, we assume the use of remote user
authentication. Remote Authentication Dial-In User Service (RADIUS) (Willens, et al, 2000) is a
well-known client/server protocol that allows remote entities to communicate with a server to authenticate
remote users. RADIUS gives an organization the ability to maintain user profiles in a specific database that
the remote servers share.

The Domain Trust Enforcement (DTE) agent performs the authorization process for the end-to- 
end adaptive trust. Based on the results of the authentication process and the received trust level, the 
DTE agent grants or denies authorization to access the resources, i.e., allow or deny the 
communication between the different entities. 
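
The grant/deny step of the DTE agent can be sketched as a simple predicate. This is a minimal illustration only: the text does not specify how the received trust level is compared, so the threshold value, the [0, 1] trust scale and the function name `dte_authorize` are all assumptions.

```python
def dte_authorize(authenticated: bool, trust_level: float,
                  threshold: float = 0.5) -> bool:
    """Grant access only to an authenticated entity that is trusted enough.

    `threshold` is a hypothetical policy parameter, not a value from the text.
    """
    return authenticated and trust_level >= threshold


# An authenticated entity with a low trust level is still denied.
decisions = [
    dte_authorize(True, 0.8),   # granted
    dte_authorize(True, 0.3),   # denied: trust below the assumed threshold
    dte_authorize(False, 0.9),  # denied: authentication failed
]
```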


3.2 Collecting Raw Measurements 


Many software tools, both commercial and open source, provide important health and 
security information, such as Nagios (Nass, 2009). This information can be used to extract metrics 
for evaluating the trust of different entities. These metrics can be divided into multiple 
categories based on their source: User, Application, Machine, Connection, or Security Software 
Alerts. In order to evaluate the trust, the metrics need to be quantified and normalized to a common 
scale (e.g., between 0 and 1). Table 2 shows a set of measured metrics and their quantification functions, 
and Figure 5 shows these categories with some example metrics. 
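
The normalization step described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the `1 - count / maximum` form for count-based metrics (days since password change, failures, lock outs, hops, discarded packets) is our reading of Table 2 rather than a confirmed implementation.

```python
def password_strength(length: int, max_length: int, min_length: int = 8) -> float:
    """Score 0 for passwords shorter than min_length, else length / max_length (capped at 1)."""
    if length < min_length:
        return 0.0
    return min(length / max_length, 1.0)


def decaying_count(count: float, maximum: float) -> float:
    """Score 1 - count / maximum, dropping to 0 once count exceeds maximum (assumed form)."""
    if count > maximum:
        return 0.0
    return 1.0 - count / maximum


# Categorical metrics map each observed state directly to a score in [0, 1].
VIRUS_ALERT_SCORE = {"none": 1.0, "document": 0.5, "executable": 0.25, "worm": 0.0}

if __name__ == "__main__":
    print(password_strength(12, 16))      # 12-character password, 16-character maximum
    print(decaying_count(30, 90))         # 30 days since last change, 90-day maximum
    print(VIRUS_ALERT_SCORE["document"])  # virus found in a document
```

Every metric ends up on the same [0, 1] scale, which is what allows the weighted combination used later in the trust evaluation.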






































Category      Metric                              Quantification
User          Password Strength                   0, Password Length < 8
                                                  Password Length / Maximum Password Length, Otherwise
User          #days since last password change    0, #days > Maximum Number Of Days
                                                  1 - #days / Maximum Number Of Days, Otherwise
User          Number of authentication failures   0, #failures > Maximum Number Of Allowed Failures
                                                  1 - #failures / Maximum Number Of Allowed Failures, Otherwise
User          Lock Outs                           0, #Lock Outs > Maximum Number Of Allowed Lock Outs
                                                  1 - #Lock Outs / Maximum Number Of Allowed Lock Outs, Otherwise
Application   Developer Reputation                Reputation / Maximum Reputation
Application   Who manages the software            1, Global Administrator
                                                  0.5, Local Administrator
                                                  0, No Administrator
Connection    Number of hops                      0, #Hops > Maximum Number Of Hops
                                                  1 - #Hops / Maximum Number Of Hops, Otherwise
Connection    Number of discarded packets         0, #Discarded Packets > Maximum #Discarded Packets
                                                  1 - #Discarded Packets / Maximum #Discarded Packets, Otherwise
Machine       Firmware version                    1, Up to date
                                                  0.5, 1 Version Behind
                                                  0, Otherwise
Machine       Shared Folders                      1, No Shared Folders
                                                  0.5, Shared User Folders
                                                  0, Shared System Folders
Analyzer      Integrity Check                     1, No Problem
                                                  0.5, Problem in user data
                                                  0, Problem in system integrity
Analyzer      Virus Alerts                        1, No Alert
                                                  0.5, Virus found in a document
                                                  0.25, Virus found in an executable
                                                  0, Worm found

Table 2: Examples of metric quantification 


3.3 Behavior Analysis 


Behavior analysis applies statistical and data mining techniques to determine the current 
operating zone of the execution environment (situation awareness) and to project its behavior in the 
near future. The operating point (OP) of an environment can be defined as a point in an n-dimensional 
space with respect to well-defined attributes. An acceptable operating zone can be defined by 
combining the normal operating values for each attribute. At runtime, the operating point moves from 
one zone to another, and it might move to a zone where the environment does not meet its trust 
and security requirements. We use these movements of the OP to adjust the trust value of the current 
environment, as discussed in further detail in the Domain Trust Authority section. By 
continuously performing behavior analysis of the environment, we can proactively predict and 
detect anomalous behaviors that might have been caused by malicious attacks. Furthermore, once 
it is determined that the environment's operating point is moving outside the normal zone, the system 
adapts its trust value and then determines the appropriate proactive management techniques that can 
bring the environment back to a normal operating zone. 
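
As a rough illustration of the operating-point idea, the sketch below represents the acceptable zone as one (low, high) range per attribute and checks whether an observed operating point lies inside it. The attribute names and ranges are hypothetical, and a real deployment would likely learn the zone from data rather than hard-code it.

```python
from typing import Dict, List, Tuple

# An acceptable zone: attribute name -> (lowest normal value, highest normal value).
Zone = Dict[str, Tuple[float, float]]


def violated_attributes(op: Dict[str, float], zone: Zone) -> List[str]:
    """Return the attributes whose current value falls outside the normal range."""
    return [name for name, (low, high) in zone.items()
            if not (low <= op[name] <= high)]


def in_zone(op: Dict[str, float], zone: Zone) -> bool:
    """True when the operating point lies inside the acceptable zone on every attribute."""
    return not violated_attributes(op, zone)


if __name__ == "__main__":
    # Hypothetical attributes and ranges, chosen only for illustration.
    normal_zone = {"cpu_load": (0.0, 0.8), "failed_logins_per_min": (0.0, 3.0)}
    observed = {"cpu_load": 0.55, "failed_logins_per_min": 12.0}
    print(in_zone(observed, normal_zone))
    print(violated_attributes(observed, normal_zone))
```

The list of violated attributes indicates in which direction the operating point has left the normal zone, which is the information a trust adjustment or a recovery action would need.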

[Figure 5: diagram of example trust metrics, grouped by category] 
User: Has Password; Password Strength; Days Since Last Password Change; Passed Time Since Last 
Login; Number of Authentication Failures; Successful Logins and Logouts; Incorrect Logins; Lock Outs 
Machine: Location; Firmware Version; OS Version; Services Versions; OS/Firmware/Services Updated 
or Not; Available Disk Space; Shared Folders; Modification or Addition of Administrator Accounts; 
Change in Audit Policies; Guest Account Enabled or Not; Security Software Installed and Enabled; 
Attached Interfacing Devices 
Application: Has Digital Signature or Not; Developer Reputation; Who Manages the Software; Who 
Installed It; Updated or Not; Previously Performed Memory Violation or Not 
Connection: Number of Hops; Location of the Peer; Number of Discarded or Error Packets 
Security Software Alerts: Antivirus; Data Execution Prevention (DEP); Behavior Analysis; Firewall; 
Vulnerability Analysis 

Figure 5. Trust Metrics. 


3.4 Domain Trust Authority 


The Domain Trust Authority (DTA) evaluates the end-to-end trust over secure communications. It defines a 
tuple (machine, application, user, data) to be an entity, and all communications among entities have a 
certain context. Thus authentication is conducted per entity. Every entity has a trust level associated 
with it. In order to measure the trust, trust metrics are introduced; they take values between 0 and 1, 
where 0 represents distrust and 1 represents blind or full trust. The trust measurements for all entities are 
stored in an entity called the Trust Authority. The NIST standard SP 800-53 (NIST, 2010) is used and it 
defines four levels of trust: 





Level          Distrust    Low Trust    Moderate    High Trust 
Trust Value    0.00        0.33         0.66        1.00 

Initially, a risk and impact analysis is performed to quantify the impact of each component on the 
overall operations of the network. Common Vulnerabilities and Exposures (CVE) and the Common 
Vulnerability Scoring System (CVSS) are used to evaluate the initial impact for both software and the 
environment, and the reputations of the users are used to assign their initial impacts. Based on the initial 
impact analysis, the initial trust value for each entity is determined. The risk and impact analysis 
is consistent with the NIST "Recommended Security Controls for Federal Information 
Systems and Organizations" report. According to the NIST report, risk measures the extent to which 
entities are threatened by circumstances or events. The risk is a function of impact and its probability 
of occurrence. Risks arise from the loss of confidentiality, integrity, and/or availability of information 
and resources. Thus the initial trust T can be viewed as an inverse function of the risk R: 


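$$T = \frac{1}{R} \tag{1}$$

where the risk of an entity i is a function of the impact imp: 

$$R_i = \mathrm{imp}_i(\text{confidentiality}) \cdot \Pr\big(\mathrm{imp}_i(\text{confidentiality})\big) + \mathrm{imp}_i(\text{integrity}) \cdot \Pr\big(\mathrm{imp}_i(\text{integrity})\big) + \mathrm{imp}_i(\text{availability}) \cdot \Pr\big(\mathrm{imp}_i(\text{availability})\big) \tag{2}$$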


When a new entity is added, it has to register with the Mutual Authentication (MA) module and 
then its initial trust value can be quantified according to Equations 1 and 2. 
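
A minimal sketch of Equations 1 and 2 follows. The impact scale and the example numbers are invented for illustration, and clamping the result to [0, 1] (so that it is a valid trust value when R is below 1) is our assumption, not something the text specifies.

```python
ASPECTS = ("confidentiality", "integrity", "availability")


def risk(impact: dict, probability: dict) -> float:
    """Equation 2: sum over the three aspects of impact times probability of that impact."""
    return sum(impact[a] * probability[a] for a in ASPECTS)


def initial_trust(r: float) -> float:
    """Equation 1, T = 1 / R, clamped to [0, 1] (the clamp is an assumption)."""
    if r <= 0:
        return 1.0  # no measurable risk: treat as full trust (assumption)
    return min(1.0, 1.0 / r)


if __name__ == "__main__":
    # Hypothetical impact scores and occurrence probabilities.
    impact = {"confidentiality": 4.0, "integrity": 2.0, "availability": 6.0}
    probability = {"confidentiality": 0.5, "integrity": 0.25, "availability": 0.5}
    r = risk(impact, probability)
    print(r, initial_trust(r))
```

With these numbers the risk is 4(0.5) + 2(0.25) + 6(0.5) = 5.5, so the initial trust is 1/5.5, roughly 0.18.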


Verify Trust 


When an entity communicates with another entity, an Autonomic Trust Management (ATM) agent 
obtains the trust level of the entity it needs to interact with from the Trust Authority (TA); see 
Figure 6. If the trust level of the remote entity is below the minimum required trust level set in the 
policies, then the communication is dropped. Because the ATM agent continuously checks with the TA module, 
interacting entities are not able to communicate if they do not meet the end-to-end trust policies. 
Once the trust levels of the components are verified, the entities can proceed and interact using the secure 
communications. 
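
The check described above amounts to a threshold test against the Trust Authority's records. The sketch below uses a plain dictionary as a stand-in for the Trust Authority; the entity tuple values and the decision to treat unknown entities as distrusted are assumptions made for the example.

```python
from typing import Dict, Tuple

# An entity is the tuple (machine, application, user, data), as defined in the text.
Entity = Tuple[str, str, str, str]


def verify_trust(trust_authority: Dict[Entity, float],
                 remote: Entity,
                 min_trust: float) -> bool:
    """Allow communication only if the remote entity meets the policy's minimum trust."""
    # Entities unknown to the Trust Authority are treated as distrusted (assumption).
    level = trust_authority.get(remote, 0.0)
    return level >= min_trust


if __name__ == "__main__":
    # Hypothetical entities and trust values.
    ta = {("host-a", "mail", "alice", "inbox"): 0.66,
          ("host-b", "ftp", "bob", "archive"): 0.33}
    print(verify_trust(ta, ("host-a", "mail", "alice", "inbox"), min_trust=0.5))
    print(verify_trust(ta, ("host-b", "ftp", "bob", "archive"), min_trust=0.5))
```

Calling the check before every exchange, rather than once per session, is what makes the gate follow the trust value as it is re-evaluated at runtime.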


[Figure 6: diagram linking the Trust Authority, Mutual Authentication, and End-to-End Communication] 

Figure 6. Adaptive End-to-End Trust 


Adaptive Trust 
The trust value assigned to each component is not static and is updated continuously. The Trust 
Authority module is responsible for re-evaluating the trust at runtime. As mentioned in the 
previous section, the trust is measured per entity and the trust levels are between 0 and 1: 

$$T(E) \in [0, 1] \tag{3}$$


Each interaction between entities is governed by a context C. Thus, the trust level for entities is 
computed per context: 

$$T(E, C) \in [0, 1] \tag{4}$$


A Forgiveness Factor, F, is assigned to provide an adaptive mechanism for compromised entities 
to start regaining trust after all existing vulnerabilities have been fixed. Based on the impact of the entity 
on the overall operations, we can control the time it takes for that entity to recover its trust level. 
Monitoring, measuring, and quantifying trust metrics are required, and they are performed by the 
ATM. M_i denotes the collected trust metric, where i is the metric identifier. The function m_i() is a 
quantifying function that returns a measurement between 0 and 1 for the metric M_i. 

The overall trust for an entity is computed using two types of trust: 1) self-measured trust and 2) 
reputation-measured trust. The self-measured trust, T_S, is evaluated based on the 
measurements performed by the ATM agent that manages the entity. The reputation-measured 
trust, T_P, is based on the trust metrics collected from peers that recently interacted with 
the entity for which the trust is being re-evaluated. T_S and T_P are given by the following equations: 


$$T_S(E, C) = T(ATM_E, C) \sum_{i=1}^{L} I_i(C)\, m_i(M_i)$$

$$T_P(E, C) = \frac{1}{K} \sum_{j=1}^{K} T(ATM_j, C) \sum_{i=1}^{L} I_i(C)\, m_i(M_i) \tag{5}$$


The value of the metric weight I_i for metric i is determined by a feature selection 
technique, where: 

$$\sum_{i=1}^{L} I_i(C) = 1 \tag{6}$$
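
The two trust measures can be sketched as follows, reading Equation 5 as a weighted sum of quantified metrics scaled by the trust in the reporting ATM agent, with the reputation measure averaging that quantity over K peers. The function names and example numbers are ours, and the weights are assumed to be already chosen (the feature selection step is not shown).

```python
from typing import List, Sequence, Tuple


def weighted_metrics(weights: Sequence[float], measurements: Sequence[float]) -> float:
    """Sum of I_i(C) * m_i(M_i); the weights must sum to 1 (Equation 6)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("metric weights must sum to 1")
    return sum(w * m for w, m in zip(weights, measurements))


def self_trust(atm_trust: float, weights: Sequence[float],
               measurements: Sequence[float]) -> float:
    """T_S: trust in the managing ATM agent times the weighted metric score."""
    return atm_trust * weighted_metrics(weights, measurements)


def reputation_trust(peer_reports: List[Tuple[float, Sequence[float]]],
                     weights: Sequence[float]) -> float:
    """T_P: average over peers j of T(ATM_j, C) times that peer's weighted metric score."""
    total = sum(t * weighted_metrics(weights, m) for t, m in peer_reports)
    return total / len(peer_reports)


if __name__ == "__main__":
    weights = [0.5, 0.3, 0.2]  # hypothetical I_i(C)
    print(self_trust(0.8, weights, [1.0, 0.5, 0.0]))
    print(reputation_trust([(1.0, [1.0, 1.0, 1.0]), (0.5, [0.5, 0.5, 0.5])], weights))
```

Because each quantified metric lies in [0, 1] and the weights sum to 1, both results stay in [0, 1] whenever the ATM trust values do.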


Based on the context and the type of operations, the end-to-end trust is evaluated using three trust 
evaluation strategies: Optimistic, Pessimistic, and Average. The end-to-end trust for each strategy can 
be evaluated as follows: 




















Trust Confidence    Trust Evaluation Strategy 
Optimistic          T(E, C) = max {T_S(E, C), T_P(E, C)} 
Average             T(E, C) = ave {T_S(E, C), T_P(E, C)} 
Pessimistic         T(E, C) = min {T_S(E, C), T_P(E, C)} 








Once T(E, C) is computed, it is mapped to the nearest trust level (High, Moderate, Low, 
or None). 
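
The three strategies and the mapping to the nearest level can be sketched together. The level values are the ones listed earlier (0.00, 0.33, 0.66, 1.00); the names and the tie-breaking behavior (the first level in the table wins) are choices made for this example.

```python
from statistics import mean

# Trust levels and their reference values, as listed in the trust-level table.
TRUST_LEVELS = {"Distrust": 0.00, "Low": 0.33, "Moderate": 0.66, "High": 1.00}

# Each strategy reduces the pair (T_S, T_P) to a single end-to-end value.
STRATEGIES = {"optimistic": max, "average": mean, "pessimistic": min}


def end_to_end_trust(t_self: float, t_peer: float, strategy: str) -> float:
    """Combine self-measured and reputation-measured trust under the chosen strategy."""
    return STRATEGIES[strategy]([t_self, t_peer])


def nearest_level(trust: float) -> str:
    """Map a trust value to the level whose reference value is closest."""
    return min(TRUST_LEVELS, key=lambda name: abs(TRUST_LEVELS[name] - trust))


if __name__ == "__main__":
    for name in STRATEGIES:
        t = end_to_end_trust(0.52, 0.625, name)
        print(name, t, nearest_level(t))
```

In this example all three strategies land on the Moderate level; the choice of strategy matters most when the self-measured and peer-reported values disagree strongly.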

The Trust Authority module re-evaluates the trust of all components and their entities 
whenever the ATM agents report new metrics that require an update, applying the trust 
evaluation strategy above. Other reasoning strategies, such as Bayesian reasoning, 
evidential reasoning, and belief functions (Blasch, et al, 2013), can also 
be used to evaluate trust. 

In a DDDAS cyber environment, there are many levels of information fusion, but to build a 
trustworthy DDDAS environment, we need to check the trust of each level of information fusion. The 
Domain Trust Authority is the place to verify the trust of each entity passing information within the 
DDDAS environment. When the trust level drops below a certain threshold, the incoming data can be 
dropped to enable secure communications. What follows are the DDDAS theory, simulations, 
measurements, and software analysis for the information fusion levels of cyber data, situation/behavior 
assessment, information management, and user refinement. 


3.5 Bayes versus Evidential Reasoning 


A fundamental technique for data fusion is Bayes' rule. Recently, Dezert, et al. (2012) have shown 
that Dempster's rule is consistent with probability calculus and Bayesian reasoning if and only if the 
prior P(X) is uniform. However, when P(X) is not uniform, Dempster's rule gives a different 
result. Both (Yen, 1986) and (Mahler, 1996) developed methods to account for non-uniform priors. 
Others have also tried to compare Bayes and evidential reasoning (ER) methods (Mahler, 2005, 
Blasch, et al., 2013). Assuming that we have multiple measurements Z = {Z_1, Z_2, ..., Z_N} for the cyber 
detection D being monitored, Bayesian and ER methods are developed next. 


3.6 Relating Bayes to Evidential Reasoning 


Assuming conditional independence, one has the Bayes method: 


$$P(X \mid Z_1 \cap Z_2) = \frac{P(X \mid Z_1)\, P(X \mid Z_2) / P(X)}{\sum_{i=1}^{N} P(X_i \mid Z_1)\, P(X_i \mid Z_2) / P(X_i)} \tag{7}$$
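
Equation 7 can be checked numerically with a short sketch. We read it as the normalized product of the two posteriors divided by the prior, over N mutually exclusive hypotheses; the function name and the example probabilities are ours, and the prior is assumed to be strictly positive for every hypothesis.

```python
from typing import List, Sequence


def bayes_fuse(prior: Sequence[float], post1: Sequence[float],
               post2: Sequence[float]) -> List[float]:
    """Fuse P(X|Z1) and P(X|Z2) under conditional independence (Equation 7).

    Each argument lists one probability per hypothesis X_i; every prior must be > 0.
    """
    unnormalized = [p1 * p2 / p for p, p1, p2 in zip(prior, post1, post2)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    prior = [0.2, 0.8]   # hypothetical P(attack), P(no attack)
    post1 = [0.6, 0.4]   # hypothetical P(X | Z1)
    post2 = [0.7, 0.3]   # hypothetical P(X | Z2)
    print(bayes_fuse(prior, post1, post2))
    # A sensor whose posterior equals the prior carries no information,
    # so fusing with it returns the other sensor's posterior.
    print(bayes_fuse(prior, post1, prior))
```

The second call illustrates the limiting case in which one measurement is uninformative.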


With no information from Z_1 or Z_2, P(X | Z_1, Z_2) = P(X). Without Z_2, P(X | Z_1, Z_2) = P(X | 
Z_1), and without Z_1, P(X | Z_1, Z_2) = P(X | Z_2). Using Dezert's formulation, the denominator 
can be expressed through a normalization coefficient: 


$$m_{12}(\emptyset) = 1 - \sum_{X_i, X_j \,\mid\, X_i \cap X_j \neq \emptyset} P(X_i \mid Z_1)\, P(X_j \mid Z_2) \tag{8}$$


Using this relation, where m_12(∅) is the total probability mass of the conflicting information, the 
fused posterior is 

$$P(X \mid Z_1 \cap Z_2) = \frac{1}{1 - m_{12}(\emptyset)}\, P(X \mid Z_1)\, P(X \mid Z_2) \tag{9}$$

which corresponds to Dempster's rule of combination using Bayesian belief masses with uniform 
priors. When the priors are not uniform, Dempster's rule is not consistent with Bayes' rule. For 
example, let m_0(X) = P(X), m_1(X) = P(X | Z_1), and m_2(X) = P(X | Z_2); then 

$$m_{DS}(X) = \frac{m_0(X)\, m_1(X)\, m_2(X)}{1 - m_{012}(\emptyset)} = \frac{P(X)\, P(X \mid Z_1)\, P(X \mid Z_2)}{\sum_{i=1}^{N} P(X_i)\, P(X_i \mid Z_1)\, P(X_i \mid Z_2)} \tag{10}$$

which multiplies by the prior P(X) where Equation 7 divides by it. Thus, methods are needed to deal 
with non-uniform priors and appropriately redistribute the conflicting masses. 
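
The disagreement can be seen with a small numeric sketch. For Bayesian belief masses (mass only on single hypotheses), Dempster's rule reduces to a normalized element-wise product, which is what `dempster_bayesian` computes; `bayes_fuse` implements the conditional-independence fusion of Equation 7. The function names and probabilities are illustrative assumptions.

```python
from typing import List, Sequence


def dempster_bayesian(*masses: Sequence[float]) -> List[float]:
    """Dempster's rule for Bayesian bba's: normalized element-wise product of the masses."""
    product = [1.0] * len(masses[0])
    for m in masses:
        product = [a * b for a, b in zip(product, m)]
    total = sum(product)  # equals 1 minus the conflicting mass
    return [p / total for p in product]


def bayes_fuse(prior: Sequence[float], post1: Sequence[float],
               post2: Sequence[float]) -> List[float]:
    """Bayes fusion under conditional independence: posteriors multiplied, prior divided out."""
    unnormalized = [p1 * p2 / p for p, p1, p2 in zip(prior, post1, post2)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    post1, post2 = [0.6, 0.4], [0.7, 0.3]  # hypothetical P(X | Z1), P(X | Z2)
    for prior in ([0.5, 0.5], [0.2, 0.8]):
        ds = dempster_bayesian(prior, post1, post2)  # prior combined as a third source
        bayes = bayes_fuse(prior, post1, post2)
        print(prior, "Dempster:", ds, "Bayes:", bayes)
```

With the uniform prior both rules return the same posterior; with the prior [0.2, 0.8] Dempster's rule gives about 0.47 for the first hypothesis while Bayes fusion gives about 0.93.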


3.7 Proportional Conflict Redistribution 


Recent advances in DS methods include Dezert-Smarandache Theory (DSmT). DSmT is an 
extension to the Dempster-Shafer method of evidential reasoning which has been detailed in 
numerous papers and texts: Advances and applications of DSmT for information fusion (Collected 
works), Vols. 1-3 (Dezert, et al., 2009). Dezert, et al. (2002) introduced the methods for the 
reasoning and later presented the hyper power-set notation for DSmT (Dezert, et al., 2003). Recent 
applications include the DSmT Proportional Conflict Redistribution rule 5 (PCR5) applied to target 
tracking (Blasch, 2013). 

The key contributions of DSmT are the redistributions of masses such that no refinement of the 
frame Θ is possible unless a series of constraints are known. For example, Shafer's model (Shafer, 
1976) is the most constrained DSm hybrid model in DSmT. Since Shafer's model, authors have 
continued to refine the method to more precisely address the combination of conflicting beliefs 
(Josang, et al., 2006) and the generalization of the combination rules (Smarandache, et al., 2005, Daniel, 
2006). An adaptive combination rule (Florea, et al., 2006) and rules for quantitative and qualitative 
combinations (Martin, 2008) have been proposed. Recent examples for sensor applications include 
electronic support measures (Djiknavorian, et al., 2010), physiological monitoring sensors (Lee, et al., 
2010), and seismic-acoustic sensing (Blasch, et al., 2011). 

Here we use the Proportional Conflict Redistribution rule no. 5 (PCR5)*. We replace Smets' rule 
(Smets, 2005) with the more effective PCR5 and apply it to cyber detection probabilities. All details and 
justifications, with examples of PCRn fusion rules and DSm transformations, can be found in the DSmT compiled 
texts (Dezert, et al., 2009, Vols. 2 & 3). A comparison of the methods is shown in Figure 7. 

Figure 7. Comparison of Bayesian, Dempster-Shafer, and PCR5 Fusion Theories 


In the DSmT framework, PCR5 is generally used to combine basic belief assignments 
(bba's). PCR5 transfers the conflicting mass only to the elements involved in the conflict and 
proportionally to their individual masses, so that the specificity of the information is entirely 
preserved in the fusion process. Let m_1(.) and m_2(.) be two independent bba's; then the PCR5 rule is 
defined as follows (see Dezert, et al., 2009, Vol. 2 for full justification and examples): m_PCR5(∅) = 0 
and, for all X ∈ 2^Θ \ {∅}, where ∅ is the null set and 2^Θ is the power set: 

$$m_{PCR5}(X) = \sum_{\substack{X_1, X_2 \in 2^{\Theta} \\ X_1 \cap X_2 = X}} m_1(X_1)\, m_2(X_2) + \sum_{\substack{X_2 \in 2^{\Theta} \\ X_2 \cap X = \emptyset}} \left[ \frac{m_1(X)^2\, m_2(X_2)}{m_1(X) + m_2(X_2)} + \frac{m_2(X)^2\, m_1(X_2)}{m_2(X) + m_1(X_2)} \right]$$




where ∩ is the intersection and all denominators in the equation above are different from zero. If a 
denominator is zero, that fraction is discarded. Additional properties and extensions of PCR5 for 
combining qualitative bba's can be found in (Dezert, 2009, Vol. 2 & 3) with examples and results. All 
propositions/sets are in a canonical form. 

* Note: PCR as used here comes from information fusion technology and is not the Platform Configuration 
Register (PCR) of the Trusted Platform Module (TPM) hardware technology. 
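
A compact sketch of the two-source PCR5 rule follows. Each bba is represented as a dictionary from a `frozenset` of hypotheses to its mass; this representation, the function name, and the example masses are choices made for illustration and are not taken from the DSmT texts.

```python
from itertools import product
from typing import Dict, FrozenSet

BBA = Dict[FrozenSet[str], float]


def pcr5(m1: BBA, m2: BBA) -> BBA:
    """Combine two bba's with PCR5.

    Products of compatible focal elements go to their intersection. Each conflicting
    product m1(A) * m2(B), with A and B disjoint, is split between A and B in
    proportion to m1(A) and m2(B).
    """
    fused: BBA = {}
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            fused[common] = fused.get(common, 0.0) + ma * mb
        elif ma + mb > 0:  # a fraction with a zero denominator is discarded
            fused[a] = fused.get(a, 0.0) + ma ** 2 * mb / (ma + mb)
            fused[b] = fused.get(b, 0.0) + mb ** 2 * ma / (ma + mb)
    return fused


if __name__ == "__main__":
    attack, normal = frozenset({"attack"}), frozenset({"normal"})
    m1 = {attack: 0.6, normal: 0.4}  # hypothetical sensor 1
    m2 = {attack: 0.2, normal: 0.8}  # hypothetical sensor 2
    fused = pcr5(m1, m2)
    print(fused[attack], fused[normal], sum(fused.values()))
```

The two shares of a conflicting product add up to that product, so the fused masses still sum to 1 without the global renormalization that Dempster's rule applies.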


3.8 Example of DDDAS Cyber Trust Analysis 


In this example, we assume that policies are accepted and that the trust stack must determine 
whether the dynamic data is trustworthy. The application system collects raw measurements on the 
data intrusion (such as denial of service attacks), and situation awareness is needed. Conventional 
information fusion processing would include Bayesian analysis to determine the state of the attack. 
However, here we use the PCR5 rule, which distributes the conflicting information over the partial 
states. Figure 8 shows the results for a normal system being attacked and the different methods 
(Bayes, DS, and PCR5) used to assess the dynamic attack. Trust is then determined with the percent 
improvement in analysis. Since the cyber classification of attack versus no attack is not consistent, 
there is some conflict in the processing of the measurement data as the measurements switch from 
no attack to attack and vice versa. The constant changing of measurements requires acknowledgment of the 
change and of the data conflict, as measured using the PCR5 method. 

Figure 8. Results of Bayesian, Dempster-Shafer, and PCR5 Fusion Theories for trust. 


The improvement of PCR5 over Bayes is shown in Figure 8 and compared with the modest 
improvement from DS. The average performance improvement over Bayes is 46% for PCR5 and 2% for DS, 
although these figures are data and application dependent. When comparing the results, it can be seen that 
when a system goes from a normal to an attack state, PCR5 responds more quickly in analyzing the attack, 
thereby maintaining trust in the decision. Such issues of data reliability, statistical credibility, and application 
survivability all contribute to the presentation of information to an application-based user. While the 
analysis is based on behavioral situation awareness, it is understood that policies and secure 
communications can leverage this information for domain trust analysis and for authentication and 
authorization that can map measurements to software requirements. 


3.9 Policies Enforcement 


Policies are an important component of cyber trust (Blasch, 2012), as shown in Figure 9. As an 
example, a policy is administered for the retrieval of information. Policy information determines the 
attributes for decisions, and determining the decision leads to enforcement. Such a decision is based on 
trust processing, from which effective enforcement can support secure communications. 

Figure 9. Policy-Based Fusion of Information requiring Trust (Blasch, 2012) 












































There are many possible information fusion strategies to enable data access from policies. Here we 
demonstrated an analysis of Bayesian versus evidential reasoning for determining cyber situation 
awareness trust. Future work includes threat intent (Shen, et al., 2009), impact assessment (Shen, et 
al., 2007), transition behaviors (Du, et al., 2011), and advanced forensics analysis (Yu, et 
al., 2013). 


4 Conclusions 


Information fusion (IF) and Dynamic Data-Driven Application Systems (DDDAS) are emerging 
techniques to deal with big data, multiple models, and decision making. One topic of interest to both 
fields of study is a measure of trust. In this paper, we explored a system for cyber security fusion 
which addresses system-level application issues of model building, data analysis, and policies for 
application trust. IF and data-driven applications utilize a common framework of probability analysis, 
and here we explored a novel technique, PCR5, that builds on Bayesian and Dempster-Shafer theory 
to determine trust. Future research would include real-world data, complete analysis of the trust stack, 
and sensitivity of models/measurements in secure cyber situation awareness trust analysis. 














